Abstract

We sharpen the Evolving set methodology of Morris and Peres and extend it to study convergence
in total variation, relative entropy, $L^2$ and other distances. Bounds in terms of a modified
form of conductance are given which apply even for walks with no holding probability. These
bounds are found to be strictly better than earlier Evolving set bounds, may be substantially better
than conductance profile results derived via Spectral profile, and drastically sharpen Blocking
Conductance bounds if there are no bottlenecks at small sets.

Keywords : Mixing time, evolving sets, blocking conductance, spectral profile, conductance. 

1 Introduction 

An isoperimetric bound on mixing time uses a geometric quantity, such as conductance, to bound 
the rate of convergence of a Markov chain. Such bounds have played a key role in proving mixing 
time results, beginning with Jerrum and Sinclair's [5] proof that a random walk for approximating the 
permanent of a dense matrix converges in polynomial time. Their idea has been extended to apply to 
non-reversible non-lazy walks [9, 3], to continuous state spaces [8], to walks with low conductance on 
small sets [8], and to walks with high conductance on small sets [7]. 

Three recent papers have built on the Average Conductance idea of Lovász and Kannan [7].
Morris and Peres [16] develop the Evolving Set methodology to show very strong results in terms of
$L^2$ distance. Kannan, Lovász and Montenegro [6] show similar results for total variation distance of
a reversible, lazy walk through the method of Blocking Conductance. Finally, Goel, Montenegro, and
Tetali [4] use the notion of Spectral Profile to extend an approach of Fill [3] and bound mixing of
finite Markov chains. Each of these was shown by a very different method: by using a duality based
approach, by considering the n-step average distribution, and by direct examination of the drop in
variance, respectively.

The goal of this paper is to develop a general framework under which these isoperimetric results 
are unified as much as possible. This will be done by strengthening the Evolving Set methodology. 
Our improved argument leads to bounds on any convex notion of distance, including total variation,
relative entropy, $L^2$, Hellinger, and Wasserstein distances. These are the first isoperimetric bounds on
most of these distances, and even when past bounds are known these are the first which are sharp.
For each of these distances we can also derive bounds in terms of an extension of the conductance
method, known as modified conductance, which is consistent with past bounds when applied to lazy
walks but which also applies in the setting of walks with no holding probability.

How do our new Evolving Set results compare to previous isoperimetric bounds? We find that our 
new mixing bound is slightly better than earlier Evolving Set results, our conductance bounds on 
mixing may be substantially better than those derived from Spectral Profile bounds, and our mixing 
bounds are significantly sharper than those of Blocking Conductance except when the worst bottleneck 
is at a small set. Moreover, our results explain the curious existence of three total variation mixing
bounds in the Blocking Conductance paper [6]. We find these are in fact total variation extensions
of a bound on $L^2$ mixing, a bound on relative entropy mixing, and a direct bound on total variation
mixing.

This paper is focused on developing a rich theoretical framework, and comparing it to past methods. 
As such we give only one "new result," in Example 5.5, where we prove the first mixing bound for the 
(non-lazy) simple random walk on an undirected graph, and more generally on a directed Eulerian 
graph. Other applications are left to companion papers, which we briefly describe here. In [11] we 
show a version of Cheeger's inequality which bounds (complex-valued) eigenvalues of non-reversible
chains, a version to bound the smallest eigenvalue of a reversible chain, and we also sharpen Cheeger 
inequalities of Jerrum and Sinclair [5], Alon [1] and Stoyanov [17] for bounding the spectral gap in 
terms of isoperimetric measures of edge and/or vertex expansion of a non-reversible walk. In [12] we 
study general walks on directed graphs and use modified conductance, along with ideas of [11], to show 
near-optimal bounds on spectral gap, (complex-valued) eigenvalues, and both total variation and $L^\infty$
mixing times. For instance, we find that among simple and max-degree walks on directed Eulerian
graphs (i.e. in-degree=out-degree) with a weak expansion condition, a walk on a cycle with clockwise
drift is within a small constant factor of having the smallest spectral gap, largest non-trivial
(complex-valued) eigenvalue, and slowest mixing time in total variation and $L^\infty$ distances. In [10] we show
canonical path and comparison theorems in terms of edge congestion, edge and vertex congestion, or 
a mixture of isoperimetric and congestion methods. Finally, together with Tetali [14] we substantially 
improve on mixing time bounds of Morris for the Thorp shuffle [15], by use of a conductance-profile 
bound based on ideas developed in this paper for walks with no holding probability. 

The paper proceeds as follows. In Section 2 we introduce the notion of Evolving sets, and use 
this to show isoperimetric bounds on distances and mixing times. This is followed in Section 3 by 
conductance and modified conductance, an extension of conductance to non-lazy walks. The new 
results are compared to previous isoperimetric methods in Section 4. Finally, in Section 5 a few 
examples are given to illustrate sharpness, and the mixing bound for the simple walk on a directed 
Eulerian graph is also considered. The Appendix contains proofs of identities and inequalities which 
were required for our results. 

2 Set bounds on distance and Mixing Times 

In this section the main developments of this paper are given, isoperimetric methods for bounding 
several notions of distance and mixing time. The arguments are based on the evolving set process of 
Morris and Peres [16] which was also described in the context of duality by Diaconis and Fill [2]. 

A little notation is required. Let $P$ be a finite irreducible Markov kernel on state space $V$ with
stationary distribution $\pi$, that is, $P$ is a $|V| \times |V|$ matrix with entries in $[0,1]$, row sums are one, $V$ is
connected under $P$ (i.e. $\forall x,y \in V\ \exists t : P^t(x,y) > 0$), and $\pi$ is a distribution on $V$ with $\pi P = \pi$. The
time-reversal $P^*$ is given by $P^*(x,y) = \frac{\pi(y)P(y,x)}{\pi(x)}$ and is a Markov chain with stationary distribution
$\pi$ as well. If $A, B \subset V$ the ergodic flow from $A$ to $B$ is given by $Q(A,B) = \sum_{x\in A,\, y\in B} \pi(x)P(x,y)$.
Given initial distribution $\sigma$, the n-step discrete time distribution is given by $\sigma P^n$, and if the walk is
aperiodic then $\sigma P^n \xrightarrow{n\to\infty} \pi$.
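The notation above can be made concrete with a small sketch. The chain below (a biased walk on a 3-cycle) is a hypothetical example chosen for illustration and does not appear in the paper; the helper names `is_stationary`, `time_reversal` and `ergodic_flow` are likewise assumptions. Exact rational arithmetic is used so that the identities can be checked without rounding.

```python
from fractions import Fraction as F

# Hypothetical example chain (not from the paper): a walk on a 3-cycle that
# steps clockwise with probability 2/3 and counter-clockwise with probability 1/3.
P = [[F(0), F(2, 3), F(1, 3)],
     [F(1, 3), F(0), F(2, 3)],
     [F(2, 3), F(1, 3), F(0)]]
pi = [F(1, 3), F(1, 3), F(1, 3)]  # uniform, since P is doubly stochastic

def is_stationary(pi, P):
    """Check pi P = pi entry by entry."""
    n = len(P)
    return all(sum(pi[x] * P[x][y] for x in range(n)) == pi[y] for y in range(n))

def time_reversal(pi, P):
    """P*(x, y) = pi(y) P(y, x) / pi(x)."""
    n = len(P)
    return [[pi[y] * P[y][x] / pi[x] for y in range(n)] for x in range(n)]

def ergodic_flow(pi, P, A, B):
    """Q(A, B) = sum over x in A and y in B of pi(x) P(x, y)."""
    return sum(pi[x] * P[x][y] for x in A for y in B)

P_star = time_reversal(pi, P)
```

For this chain the time-reversal is simply the walk with the drift direction flipped, and `ergodic_flow(pi, P, [0], [1, 2])` agrees with `ergodic_flow(pi, P, [1, 2], [0])`, as it must for any set and its complement under a stationary distribution.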

2.1 Duality and Evolving sets 

The key to our results is a dual process. Given a Markov chain on $V$ with transition matrix $P$, a dual
process consists of a walk $\hat{P}$ on some state space $\hat{V}$ and a link, or transition matrix, $\Lambda$ from $\hat{V}$ to $V$
such that

$$P\Lambda = \Lambda\hat{P} .$$

In particular, $P^n\Lambda = \Lambda\hat{P}^n$ and so the evolution of $P^n$ and $\hat{P}^n$ will be closely related.

In order to relate a property of sets (set expansion) to a property of the original walk (mixing
time) we construct a walk on sets that is a dual to the original Markov chain. A natural candidate
to link a walk on sets to a walk on states is the projection $\Lambda(S,y) = \frac{\pi(y)}{\pi(S)}1_S(y)$. Diaconis and Fill [2]
have shown that for certain classes of Markov chains the walk $\hat{K}$ below is the unique dual process
with link $\Lambda$, so this is the walk on sets that should be considered. We use notation of Morris and Peres
[16].

Definition 2.1. Given set $A \subset V$ a step of the evolving set process is given by choosing $u \in [0,1]$
uniformly at random, and transitioning to the set $A_u = \{y \in V : Q(A,y) \geq u\,\pi(y)\}$. The walk is
denoted by $S_0, S_1, S_2, \ldots, S_n$, with transition kernel $K^n(A,S) = Prob(S_n = S \mid S_0 = A)$.

The Doob transform of this process is the Markov chain on sets given by $\hat{K}(S,S') = \frac{\pi(S')}{\pi(S)}K(S,S')$,
with n-step transition probabilities $\hat{K}^n(S,S') = \frac{\pi(S')}{\pi(S)}K^n(S,S')$.

The Doob transform produces another Markov chain because of a Martingale property.

Lemma 2.2. If $A \subset V$ then

$$\int_0^1 \pi(A_u)\,du = \pi(A) .$$

Proof.

$$\int_0^1 \pi(A_u)\,du = \sum_{y\in V}\pi(y)\,Prob(y \in A_u) = \sum_{y\in V}\pi(y)\frac{Q(A,y)}{\pi(y)} = \pi(A) .$$

□
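A step of the evolving set process, and the Martingale property of Lemma 2.2, can be illustrated with a short sketch. The 3-cycle chain and the names `measure` and `evolving_set_kernel` are assumptions made for this example, not part of the paper. Since $A_u$ only changes when $u$ crosses one of the ratios $Q(A,y)/\pi(y)$, the law of $A_u$ can be computed exactly by cutting $[0,1]$ at those ratios.

```python
from fractions import Fraction as F

# Hypothetical example chain (not from the paper): biased walk on a 3-cycle.
P = [[F(0), F(2, 3), F(1, 3)],
     [F(1, 3), F(0), F(2, 3)],
     [F(2, 3), F(1, 3), F(0)]]
pi = [F(1, 3)] * 3  # stationary, since P is doubly stochastic

def measure(pi, S):
    """pi(S) as an exact fraction."""
    return sum((pi[y] for y in S), F(0))

def evolving_set_kernel(pi, P, A):
    """Return {S: K(A, S)}, the law of A_u for u uniform on [0, 1]."""
    n = len(P)
    # ratios[y] = Q(A, y) / pi(y), which always lies in [0, 1]
    ratios = [sum((pi[x] * P[x][y] for x in A), F(0)) / pi[y] for y in range(n)]
    cuts = sorted(set(ratios) | {F(0), F(1)})
    kernel = {}
    for lo, hi in zip(cuts, cuts[1:]):
        # A_u does not change for u in (lo, hi], so evaluate it at u = hi.
        S = frozenset(y for y in range(n) if ratios[y] >= hi)
        kernel[S] = kernel.get(S, F(0)) + (hi - lo)
    return kernel

A = frozenset([0])
kernel = evolving_set_kernel(pi, P, A)
# Integral of pi(A_u) du, computed as an expectation over the law of A_u.
mean_size = sum(prob * measure(pi, S) for S, prob in kernel.items())
```

Here `mean_size` equals `measure(pi, A)`, which is the statement of Lemma 2.2 for this particular set. The empty set can occur with positive probability under $K$; it is the Doob transform that removes it.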



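The walk $\hat{K}$ is a dual process of $P$.

Lemma 2.3. If $S \subset V$, $y \in V$ and $\Lambda(S,y) = \frac{\pi(y)}{\pi(S)}1_S(y)$ is the projection linkage, then

$$P\Lambda(S,y) = \Lambda\hat{K}(S,y) .$$

Proof.

$$P\Lambda(S,y) = \sum_{z\in S}\frac{\pi(z)}{\pi(S)}P(z,y) = \frac{Q(S,y)}{\pi(S)}$$

$$\Lambda\hat{K}(S,y) = \sum_{S'\ni y}\hat{K}(S,S')\frac{\pi(y)}{\pi(S')} = \frac{\pi(y)}{\pi(S)}\sum_{S'\ni y}K(S,S') = \frac{Q(S,y)}{\pi(S)}$$

The final equality is because $\sum_{S'\ni y}K(S,S') = Prob(y \in S') = Q(S,y)/\pi(y)$. □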






With duality it becomes easy to write the n step density in terms of the walk $\hat{K}$.
Lemma 2.4. Let $\hat{E}_n$ denote expectation under $\hat{K}^n$. If $x \in V$ and $S_0 = \{x\}$ then

$$P^n(x,y) = \hat{E}_n\,\pi_{S_n}(y) ,$$

where $\pi_S(y) = \frac{1_S(y)\,\pi(y)}{\pi(S)}$ denotes the probability distribution induced on set $S$ by $\pi$.
Proof.

$$P^n(x,y) = P^n\Lambda(\{x\},y) = \Lambda\hat{K}^n(\{x\},y) = \hat{E}_n\,\pi_{S_n}(y)$$

The final equality is because $\Lambda(S,y) = \pi_S(y)$. □
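Lemma 2.4 can be checked numerically on a small chain. The sketch below is an illustration under stated assumptions: the 3-cycle chain is hypothetical, and the helper names (`kernel`, `doob_step`, `dual_density`, `matrix_power_entry`) are invented here. It propagates the exact law of $S_n$ under the Doob transform $\hat{K}$, starting from $S_0 = \{x\}$, and compares $\hat{E}_n\pi_{S_n}(y)$ with the matrix power $P^n(x,y)$.

```python
from fractions import Fraction as F

# Hypothetical example chain (not from the paper): biased walk on a 3-cycle.
P = [[F(0), F(2, 3), F(1, 3)],
     [F(1, 3), F(0), F(2, 3)],
     [F(2, 3), F(1, 3), F(0)]]
pi = [F(1, 3)] * 3
N = len(P)

def measure(S):
    return sum((pi[y] for y in S), F(0))

def kernel(A):
    """K(A, .) of the evolving set process, as {set: probability}."""
    ratios = [sum((pi[x] * P[x][y] for x in A), F(0)) / pi[y] for y in range(N)]
    cuts = sorted(set(ratios) | {F(0), F(1)})
    out = {}
    for lo, hi in zip(cuts, cuts[1:]):
        S = frozenset(y for y in range(N) if ratios[y] >= hi)
        out[S] = out.get(S, F(0)) + (hi - lo)
    return out

def doob_step(dist):
    """Push a distribution on sets through K-hat(S, S') = pi(S') / pi(S) * K(S, S')."""
    new = {}
    for S, p in dist.items():
        for S2, k in kernel(S).items():
            if S2:  # the empty set has pi = 0, so K-hat never moves to it
                new[S2] = new.get(S2, F(0)) + p * k * measure(S2) / measure(S)
    return new

def dual_density(x, y, n):
    """E-hat_n pi_{S_n}(y) with S_0 = {x}."""
    dist = {frozenset([x]): F(1)}
    for _ in range(n):
        dist = doob_step(dist)
    return sum(p * pi[y] / measure(S) for S, p in dist.items() if y in S)

def matrix_power_entry(x, y, n):
    """P^n(x, y) by repeated vector-matrix multiplication."""
    row = [F(int(z == x)) for z in range(N)]
    for _ in range(n):
        row = [sum(row[z] * P[z][w] for z in range(N)) for w in range(N)]
    return row[y]
```

The same propagation of the law of $S_n$ is what the distance bounds of the next subsection take expectations over, so a function like `doob_step` could be reused to evaluate those bounds on small examples.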

2.2 Evolving set bounds on distances 

It is now a short hop from Lemma 2.4 to a bound on mixing times. First, however, note that if a
distance $dist(\mu,\pi)$ is convex in $\mu$ (i.e. $dist(\alpha\mu_1 + (1-\alpha)\mu_2,\pi) \leq \alpha\,dist(\mu_1,\pi) + (1-\alpha)\,dist(\mu_2,\pi)$),
then for any distribution $\sigma$

$$dist(\sigma P^n,\pi) = dist\left(\sum_{x\in V}\sigma(x)P^n(x,\cdot),\ \pi\right) \leq \sum_{x\in V}\sigma(x)\,dist(P^n(x,\cdot),\pi) \leq \max_{x\in V} dist(P^n(x,\cdot),\pi) .$$

In this case distance is maximized when the initial distribution is a point mass, i.e. $\sigma(y) = \delta_{y=x}$ for
some $x \in V$. Given the preceding lemmas it is easy to show an evolving set bound for all convex
distances.

Lemma 2.5. Consider a finite Markov chain with stationary distribution $\pi$. Any distance $dist(\mu,\pi)$
which is convex in $\mu$ satisfies

$$dist(P^n(x,\cdot),\pi) \leq \hat{E}_n\,dist(\pi_{S_n},\pi)$$

whenever $x \in V$ and $S_0 = \{x\}$.
Proof. By Lemma 2.4 and convexity,

$$dist(P^n(x,\cdot),\pi) = dist(\hat{E}_n\pi_{S_n},\pi) \leq \hat{E}_n\,dist(\pi_{S_n},\pi) .$$

□

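Many distances are used in studying mixing times. These include:

• Separation distance: $s(\mu,\pi) = \max_{y\in V}\ 1 - \frac{\mu(y)}{\pi(y)}$

• Total variation distance: $\|\mu - \pi\|_{TV} = \frac{1}{2}\sum_{x\in V}|\mu(x) - \pi(x)|$

• Relative Entropy: $D(\mu\|\pi) = \sum_{x\in V}\mu(x)\log\frac{\mu(x)}{\pi(x)}$

• $L^2$ distance: $\left\|\frac{\mu}{\pi} - 1\right\|_{2,\pi} = \sqrt{\sum_{x\in V}\pi(x)\left(\frac{\mu(x)}{\pi(x)} - 1\right)^2}$

• Relative Pointwise distance ($L^\infty$): $\left\|\frac{\mu}{\pi} - 1\right\|_{\infty} = \max_{x\in V}\left|\frac{\mu(x)}{\pi(x)} - 1\right|$

• Hellinger distance: $H(\mu,\pi) = \sum_{y\in V}\left(\sqrt{\mu(y)} - \sqrt{\pi(y)}\right)^2$

• Wasserstein distance $W_p(\mu,\pi)$: Given metric $d : V\times V \to \mathbb{R}^+$, let

$$W_p^p(\mu,\pi) = \sup_{\substack{f,g:V\to\mathbb{R},\\ \forall y,z\in V:\ f(y)+g(z)\leq d(y,z)^p}} E_\mu f + E_\pi g$$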

Each of these distances can be bounded easily with Lemma 2.5. 
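Theorem 2.6. Given a finite, ergodic Markov chain, $x,y \in V$ and $S_0 = \{x\}$, then

$$\|P^n(x,\cdot) - \pi\|_{TV} \leq \hat{E}_n(1 - \pi(S_n))$$

$$D(P^n(x,\cdot)\|\pi) \leq \hat{E}_n\log\frac{1}{\pi(S_n)}$$

$$\left\|\frac{P^n(x,\cdot)}{\pi} - 1\right\|_{2,\pi} \leq \hat{E}_n\sqrt{\frac{1-\pi(S_n)}{\pi(S_n)}}$$

$$\left|\frac{P^n(x,y)}{\pi(y)} - 1\right| \leq \max\left\{\frac{1-\pi(y)}{\pi(y)},\ 1\right\} Prob_{\hat{K}^n}(S_n \neq V)$$

$$H(P^n(x,\cdot),\pi) \leq \hat{E}_n H(\pi_{S_n},\pi)$$

$$W_p^p(P^n(x,\cdot),\pi) \leq \hat{E}_n W_p^p(\pi_{S_n},\pi)$$

Most of these are immediate from the lemma and computation of $dist(\pi_S,\pi)$. For instance, in the
total variation case $\|\pi_S - \pi\|_{TV} = 1 - \pi(S)$.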

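A few cases are worth mentioning further. The relative pointwise bound is because $dist(\mu,\pi) = \left|\frac{\mu(y)}{\pi(y)} - 1\right|$
is convex, with

$$dist(\pi_S,\pi) = \left|\frac{1_S(y)}{\pi(S)} - 1\right| = \frac{1-\pi(S)}{\pi(S)}1_S(y) + 1_{S^c}(y) \leq \max\left\{\frac{1-\pi(y)}{\pi(y)},\ 1\right\}\delta_{S\neq V} .$$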



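The Hellinger distance is a special case of $dist(\mu,\pi) = C_\pi\left(\frac{\mu}{\pi}\right)$ for a convex functional $C_\pi : (\mathbb{R}^+)^V \to \mathbb{R}$.
Wasserstein distance is a case of $C_\pi(f) = \sup_{g\in H}\sum_{y\in V} g(y)f(y)\pi(y)$ for some class of functions $H$,
by rewriting as

$$W_p^p(P^n(x,\cdot),\pi) = \sup_{\substack{f,g:V\to\mathbb{R},\\ \forall y,z\in V:\ f(y)+g(z)\leq d(y,z)^p}} \sum_{y\in V}\left(f(y) + E_\pi g\right)\frac{P^n(x,y)}{\pi(y)}\,\pi(y)$$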



One case of the Wasserstein distance is worth mentioning. If $d(y,z) = \delta_{y\neq z}$ then $W_p^p$ is just the
total variation distance. It is easily checked that $W_p^p(\pi_S,\pi) = 1 - \pi(S)$ in this case, and so

$$\|P^n(x,\cdot) - \pi\|_{TV} = W_p^p(P^n(x,\cdot),\pi) \leq \hat{E}_n W_p^p(\pi_{S_n},\pi) = \hat{E}_n(1 - \pi(S_n)) ,$$

which shows the Wasserstein bound generalizes the total variation bound.
Remark 2.7. When the initial distribution $\sigma$ is not a point mass then $S_0$ should be chosen from
a distribution. Set $\sigma_0 = \sigma$. Inductively define $A_i = \{x : \sigma_i(x) > 0\}$, let $Prob(S_0 = A_i) =
\pi(A_i)\min_{x\in A_i}\frac{\sigma_i(x)}{\pi(x)}$ and $\sigma_{i+1}(x) = \sigma_i(x) - 1_{A_i}(x)\frac{\pi(x)}{\pi(A_i)}Prob(S_0 = A_i)$. Then Lemma 2.4 generalizes
to

$$\sigma P^n(y) = \hat{E}_n\,\pi_{S_n}(y) .$$

The results in Theorem 2.6 generalize to this case as well, whereas those in the next section will replace
$\pi_*$ with $f^{-1}(E f(\pi(S_0)))$.

2.3 Mixing times 

Throughout this section assume that the distance to be studied is of the form

$$dist(P^n(x,\cdot),\pi) \leq \hat{E}_n f(\pi(S_n))$$

for a decreasing function $f : [0,1] \to \mathbb{R}^+$. For instance, the total variation, $L^2$ and relative entropy
bounds in Theorem 2.6 are all of this form. Let $\tau(\epsilon)$ denote the mixing time in this distance, that is,
the minimum number of steps to guarantee that this distance is at most $\epsilon$.
Mixing time will be bounded using the f-congestion.

Definition 2.8. Given a finite Markov chain, and function $f : [0,1] \to \mathbb{R}^+$ non-zero except possibly
at 0 and 1, then the f-congestion $C_f$ and f-congestion profile $C_f(r)$ are given by

$$\forall A \subset V : C_f(A) = \frac{\int_0^1 f(\pi(A_u))\,du}{f(\pi(A))} , \qquad \forall r > 0 : C_f(r) = \max_{\pi(A)\leq r} C_f(A) , \qquad C_f = C_f(1) .$$
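As an illustration of Definition 2.8, the following sketch evaluates $C_f(A)$ and the profile $C_f(r)$ by brute force on a small chain. The 3-cycle chain and the function names are assumptions made for this example. The profile here ranges only over nonempty proper subsets, an assumption that avoids dividing by $f(0)$ or $f(1)$ when those vanish.

```python
from fractions import Fraction as F
from itertools import combinations
from math import sqrt

# Hypothetical example chain (not from the paper): biased walk on a 3-cycle.
P = [[F(0), F(2, 3), F(1, 3)],
     [F(1, 3), F(0), F(2, 3)],
     [F(2, 3), F(1, 3), F(0)]]
pi = [F(1, 3)] * 3

def measure(pi, S):
    return sum((pi[y] for y in S), F(0))

def evolving_set_kernel(pi, P, A):
    """Law of A_u for u uniform on [0, 1], as {set: probability}."""
    n = len(P)
    ratios = [sum((pi[x] * P[x][y] for x in A), F(0)) / pi[y] for y in range(n)]
    cuts = sorted(set(ratios) | {F(0), F(1)})
    out = {}
    for lo, hi in zip(cuts, cuts[1:]):
        S = frozenset(y for y in range(n) if ratios[y] >= hi)
        out[S] = out.get(S, F(0)) + (hi - lo)
    return out

def congestion(pi, P, A, f):
    """C_f(A) = (integral over u of f(pi(A_u))) / f(pi(A))."""
    k = evolving_set_kernel(pi, P, A)
    return sum(p * f(measure(pi, S)) for S, p in k.items()) / f(measure(pi, A))

def congestion_profile(pi, P, f, r):
    """C_f(r): maximum of C_f(A) over nonempty proper A with pi(A) <= r.

    Raises ValueError if no such set exists (r below the smallest pi(x)).
    """
    n = len(P)
    sets = [frozenset(c) for k in range(1, n) for c in combinations(range(n), k)
            if measure(pi, c) <= r]
    return max(congestion(pi, P, A, f) for A in sets)

tv_weight = lambda a: a * (1 - a)        # the a(1-a) weight, exact arithmetic
l2_weight = lambda a: sqrt(a * (1 - a))  # the sqrt(a(1-a)) weight, floating point
```

Keeping `f` as a parameter mirrors the way the text later switches between $C_{a(1-a)}$, $C_{a\log(1/a)}$ and $C_{\sqrt{a(1-a)}}$ depending on the distance under study.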

The starting point for our calculations will be the following discrete analog of differentiation.
Lemma 2.9.

$$\hat{E}_{n+1}f(\pi(S_{n+1})) - \hat{E}_n f(\pi(S_n)) = -\hat{E}_n f(\pi(S_n))\,(1 - C_{af(a)}(S_n)) \leq -(1 - C_{af(a)})\,\hat{E}_n f(\pi(S_n))$$

Proof. The inequality is because $\forall S \subset V : 1 - C_{af(a)} \leq 1 - C_{af(a)}(S)$, by definition of $C_{af(a)}$. For the
equality,

$$\hat{E}_{n+1}f(\pi(S_{n+1})) = \hat{E}_n\sum_S \hat{K}(S_n,S)f(\pi(S)) = \hat{E}_n f(\pi(S_n))\sum_S K(S_n,S)\frac{\pi(S)f(\pi(S))}{\pi(S_n)f(\pi(S_n))} = \hat{E}_n f(\pi(S_n))\,C_{af(a)}(S_n)$$

□


A basic mixing time bound follows easily:
Corollary 2.10. In discrete time

$$\tau(\epsilon) \leq \left\lceil \frac{1}{1 - C_{af(a)}}\log\frac{f(\pi_*)}{\epsilon} \right\rceil ,$$

where $\pi_* = \min_{x\in V}\pi(x)$.

Proof. By Lemma 2.9 $\hat{E}_{n+1}f(\pi(S_{n+1})) \leq C_{af(a)}\,\hat{E}_n f(\pi(S_n))$. By induction $\hat{E}_n f(\pi(S_n)) \leq C_{af(a)}^n f(\pi(S_0))$.
Solving for when this drops to $\epsilon$ and using the approximation $\log C_{af(a)} \leq -(1 - C_{af(a)})$ gives the corollary. □

This can be generalized to take into consideration set sizes. A stronger bound holds under a fairly 
weak convexity condition, with about a factor of two lost in the general case. 

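Theorem 2.11. If $x\,(1 - C_{af(a)}(f^{-1}(x)))$ is convex then

$$\tau(\epsilon) \leq \left\lceil \int_{\pi_*}^{f^{-1}(\epsilon)} \frac{-f'(x)\,dx}{f(x)\,(1 - C_{af(a)}(x))} \right\rceil ,$$

while in general

$$\tau(\epsilon) \leq \left\lceil \int_{f^{-1}(f(\pi_*)/2)}^{f^{-1}(\epsilon/2)} \frac{-2f'(x)\,dx}{f(x)\,(1 - C_{af(a)}(x))} \right\rceil .$$

Proof. First consider the convex case.

By Lemma 2.9 and Jensen's inequality for the convex function $x\,(1 - C_{af(a)}(f^{-1}(x)))$,

$$\hat{E}_{n+1}f(\pi(S_{n+1})) - \hat{E}_n f(\pi(S_n)) = -\hat{E}_n f(\pi(S_n))\,(1 - C_{af(a)}(S_n))$$
$$\leq -\hat{E}_n f(\pi(S_n))\left[1 - C_{af(a)}(f^{-1}\circ f(\pi(S_n)))\right]$$
$$\leq -\left[\hat{E}_n f(\pi(S_n))\right]\left[1 - C_{af(a)}\left(f^{-1}\left(\hat{E}_n f(\pi(S_n))\right)\right)\right] \qquad (1)$$

Since $I(n) = \hat{E}_n f(\pi(S_n))$ is non-increasing in $n$ and $1 - C_{af(a)}(f^{-1}(x))$ is non-decreasing in $x$, the
piecewise linear extension of $I(n)$ to $t \in \mathbb{R}^+$ satisfies

$$I'(t) \leq -I(t)\left[1 - C_{af(a)}(f^{-1}(I(t)))\right] \qquad (2)$$

At integer $t$ the derivative can be taken from either right or left.
Then,

$$\int_{I(0)}^{I(t)} \frac{dI}{I\,(1 - C_{af(a)}(f^{-1}(I)))} \leq -\int_0^t dt = -t .$$

A change of variables to $v = f^{-1}(I)$ implies that

$$\int_{f^{-1}(I(0))}^{f^{-1}(I(t))} \frac{f'(v)\,dv}{f(v)\,(1 - C_{af(a)}(v))} \leq -t .$$

By continuity of $I(t)$ there exists $T$ such that $I(T) = \epsilon$. The theorem follows from $f^{-1}(I(0)) =
f^{-1}(f(\pi_*)) = \pi_*$ and $f^{-1}(I(T)) = f^{-1}(\epsilon)$.

For the general case, use Lemma 2.12 instead of convexity at (1). □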

Lemma 2.12. If $Z \geq 0$ is a nonnegative random variable and $g$ is a nonnegative increasing function,
then

$$E(Z\,g(Z)) \geq \frac{EZ}{2}\,g(EZ/2) .$$

Proof. See [16]. Let $A$ be the event $\{Z \geq EZ/2\}$. Then $E(Z\,1_{A^c}) \leq EZ/2$, so $E(Z\,1_A) \geq EZ/2$.
Therefore,

$$E(Z\,g(2Z)) \geq E(Z\,1_A\,g(EZ)) \geq \frac{EZ}{2}\,g(EZ) .$$

Let $U = 2Z$ to get the result.

□
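It is fairly easy to translate these to mixing time bounds. For instance, if $f(a) = \sqrt{\frac{1-a}{a}}$ then by
Theorem 2.6, Corollary 2.10 and Theorem 2.11 the $L^2$-mixing times (denoted by $\tau_2(\epsilon)$) are:

$$\tau_2(\epsilon) \leq \begin{cases}
\left\lceil \dfrac{1}{2(1 - C_{\sqrt{a(1-a)}})}\log\dfrac{1-\pi_*}{\pi_*\,\epsilon^2} \right\rceil & \text{in general} \\[2ex]
\left\lceil \displaystyle\int_{\pi_*}^{\frac{1}{1+\epsilon^2}} \dfrac{dr}{2r(1-r)\,(1 - C_{\sqrt{a(1-a)}}(r))} \right\rceil & \text{if } r\left(1 - C_{\sqrt{a(1-a)}}\left(\tfrac{1}{1+r^2}\right)\right) \text{ is convex} \\[2ex]
\left\lceil \displaystyle\int_{\frac{4\pi_*}{1+3\pi_*}}^{\frac{1}{1+\epsilon^2/4}} \dfrac{dr}{r(1-r)\,(1 - C_{\sqrt{a(1-a)}}(r))} \right\rceil & \text{in general}
\end{cases}$$

By making the change of variables $x = \frac{r}{1-r}$ and applying a few pessimistic approximations one obtains
a result more strongly resembling average conductance bounds:

$$\tau_2(\epsilon) \leq \begin{cases}
\left\lceil \dfrac{1}{1 - C_{\sqrt{a(1-a)}}}\log\dfrac{1}{\epsilon\sqrt{\pi_*}} \right\rceil & \text{in general} \\[2ex]
\left\lceil \displaystyle\int_{\pi_*}^{1/\epsilon^2} \dfrac{dx}{2x\,\left(1 - C_{\sqrt{a(1-a)}}\left(\frac{x}{1+x}\right)\right)} \right\rceil & \text{if } x\left(1 - C_{\sqrt{a(1-a)}}\left(\tfrac{1}{1+x^2}\right)\right) \text{ is convex} \\[2ex]
\left\lceil \displaystyle\int_{4\pi_*}^{4/\epsilon^2} \dfrac{dx}{x\,\left(1 - C_{\sqrt{a(1-a)}}\left(\frac{x}{1+x}\right)\right)} \right\rceil & \text{in general}
\end{cases}$$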

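It is often unnecessary to compute $C_f(r)$ for $r > 1/2$. Observe that $u$ almost everywhere $(A^c)_u = (A_{1-u})^c$.
It follows that if $f(a) = f(1-a)$ then

$$C_f(A) = \int_0^1 \frac{f(\pi(A_{1-u}))}{f(\pi(A))}\,du = \int_0^1 \frac{f(1 - \pi((A^c)_u))}{f(1 - \pi(A^c))}\,du = \int_0^1 \frac{f(\pi((A^c)_u))}{f(\pi(A^c))}\,du = C_f(A^c) \qquad (3)$$

In particular, $\forall r \geq 1/2 : C_f(r) = C_f(1/2) = \max_{\pi(A)\leq 1/2} C_f(A)$.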

Remark 2.13. Mixing time bounds implied by the theorems of this section follow easily for the other
distances, but for instance with $C_{a(1-a)}$ for total variation distance and $C_{a\log(1/a)}$ for relative entropy.
However, it is often better to work with a harder distance, such as bounding total variation mixing
($\tau_{TV}(\epsilon)$) by instead bounding $L^2$ mixing ($\tau_2(\epsilon)$) and applying the relation $\tau_{TV}(\epsilon) \leq \tau_2(2\epsilon)$. The
quantities are related by $C_{a\log(1/a)}(A) \leq \frac{1}{2}\left(1 + C_{a(1-a)}(A)\right)$ (see Appendix) and $C_{\sqrt{a(1-a)}}(A) \leq \sqrt{C_{a(1-a)}(A)}$
(Cauchy-Schwarz), so generally the relative entropy or $L^2$-mixing bounds are less than a factor two
worse than the total variation bound. In contrast, the lazy walk on a binary cube $\{0,1\}^d$ has tiny
$1 - C_{a(1-a)}(\{x\}) \approx \frac{d}{2}\,2^{-d}$ but huge $1 - C_{\sqrt{a(1-a)}}(\{x\})$, so the $L^2$ bounds will give much
better asymptotics for this example.



2.4 Continuous Time 

Not much need be changed for continuous time. Let $H_t = e^{-t(I-P)}$ denote the continuous time Markov
chain at time $t$. It is easily verified that if $\hat{K}_t = e^{-t(I-\hat{K})}$ then

$$H_t(x,y) = \hat{E}_t\,\pi_{S_t}(y)$$

where $S_0 = \{x\}$ and $\hat{E}_t$ is the expectation under the walk $\hat{K}_t$. Bounds involving $P^n(x,y)$ then translate
directly into bounds in terms of $H_t(x,y)$. Once Lemma 2.9 is replaced by

$$\frac{d}{dt}\hat{E}_t f(\pi(S_t)) = -\hat{E}_t f(\pi(S_t))\,(1 - C_{af(a)}(S_t))$$

then mixing time bounds also carry over to the continuous-time case, although it is no longer necessary
to approximate by a derivative at (2) nor necessary to take the ceiling of the bounds.



3 Conductance and Modified Conductance 

The most common geometric tool for studying mixing time is the conductance $\Phi$, a measure of the
chance of leaving a set after a single step. Such bounds have been shown only for $L^2$ mixing time. In
this section we show bounds on f-congestion in terms of conductance for lazy walks, the most common
situation. The real innovation of this section, however, is the modified conductance $\phi$, an entirely new
quantity which is equivalent to conductance for a lazy walk in $L^2$ distance, but which also applies to
walks with no holding probability and to other distances as well.



3.1 Conductance 

Let us begin with a formal definition of conductance. 

Definition 3.1. The conductance $\tilde\Phi$ and conductance profile $\tilde\Phi(r)$ are given by

$$\forall A \subset V : \tilde\Phi(A) = \frac{Q(A,A^c)}{\pi(A)\pi(A^c)} , \qquad \forall r > 0 : \tilde\Phi(r) = \min_{\pi(A)\leq r}\tilde\Phi(A) , \qquad \tilde\Phi = \tilde\Phi(1/2) = \min_{A\subset V}\tilde\Phi(A) .$$

The conductance $\Phi$ and conductance profile $\Phi(r)$ are defined similarly, but in terms of $\Phi(A) =
\frac{Q(A,A^c)}{\min\{\pi(A),\,\pi(A^c)\}}$. When necessary, notation such as $\Phi_K$ will be used to denote conductance for Markov
chain $K$.


The conductance profile $\Phi(r)$ can also be used to upper bound the various f-congestion quantities
$C_f$ when the Markov chain is lazy. The argument is not hard (see also [16]).

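Theorem 3.2. Given a lazy Markov chain, and $f$ concave, then

$$C_f(A) \leq \frac{f(\pi(A) + 2Q(A,A^c)) + f(\pi(A) - 2Q(A,A^c))}{2f(\pi(A))} .$$

Proof. For a lazy chain, if $u > 1/2$ then $A_u \subset A$, and so

$$\int_{1/2}^1 \pi(A_u)\,du = \sum_{y\in A}\left(\frac{Q(A,y)}{\pi(y)} - \frac{1}{2}\right)\pi(y) = Q(A,A) - \frac{\pi(A)}{2} = \frac{\pi(A)}{2} - Q(A,A^c) .$$

By the Martingale property $\int_0^1 \pi(A_u)\,du = \pi(A)$ it follows that

$$\int_0^{1/2} \pi(A_u)\,du = \pi(A) - \int_{1/2}^1 \pi(A_u)\,du = \frac{\pi(A)}{2} + Q(A,A^c) .$$

Recall Jensen's inequality, that $\int g\circ h(u)\,d\mu \leq g\left(\int h(u)\,d\mu\right)$ if $\mu$ is a probability distribution and $g$ is
concave. By concavity of $f$,

$$C_f(A) = \frac{\int_0^{1/2} f(\pi(A_u))\,2du + \int_{1/2}^1 f(\pi(A_u))\,2du}{2f(\pi(A))} \leq \frac{f\left(2\int_0^{1/2}\pi(A_u)\,du\right) + f\left(2\int_{1/2}^1\pi(A_u)\,du\right)}{2f(\pi(A))} .$$

□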
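A quick numerical check of the lazy-chain bound of Theorem 3.2 is sketched below. It assumes the bound reads $C_f(A) \leq \frac{f(\pi(A)+2Q(A,A^c)) + f(\pi(A)-2Q(A,A^c))}{2f(\pi(A))}$; the lazy walk on a 3-cycle and the helper names are hypothetical choices for illustration, not taken from the paper.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical example (not from the paper): lazy simple walk on a 3-cycle.
P = [[F(1, 2), F(1, 4), F(1, 4)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(1, 4), F(1, 4), F(1, 2)]]
pi = [F(1, 3)] * 3
N = len(P)

def measure(S):
    return sum((pi[y] for y in S), F(0))

def congestion(A, f):
    """C_f(A), integrating f(pi(A_u)) exactly over the breakpoints of A_u."""
    ratios = [sum((pi[x] * P[x][y] for x in A), F(0)) / pi[y] for y in range(N)]
    cuts = sorted(set(ratios) | {F(0), F(1)})
    total = F(0)
    for lo, hi in zip(cuts, cuts[1:]):
        S = [y for y in range(N) if ratios[y] >= hi]
        total += (hi - lo) * f(measure(S))
    return total / f(measure(A))

def lazy_bound(A, f):
    """Right-hand side of the assumed form of Theorem 3.2."""
    a = measure(A)
    q = sum(pi[x] * P[x][y] for x in A for y in range(N) if y not in A)  # Q(A, A^c)
    return (f(a + 2 * q) + f(a - 2 * q)) / (2 * f(a))

f = lambda a: a * (1 - a)  # a concave choice of f
subsets = [frozenset(c) for k in (1, 2) for c in combinations(range(N), k)]
checks = [(congestion(A, f), lazy_bound(A, f)) for A in subsets]
```

For every nonempty proper subset of this chain the first entry of each pair in `checks` is at most the second, which is what the theorem predicts for a lazy walk and concave $f$.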



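For each choice of $f$ a bit of simplification leads to bounds on $C_f$. For instance, a lazy Markov
chain will have

$$\tau_2(\epsilon) \leq \left\lceil \frac{2}{\tilde\Phi^2}\log\frac{1}{\epsilon\sqrt{\pi_*}} \right\rceil . \qquad (4)$$

See proof of Theorem 3.5 for a similar calculation.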

A conductance bound for a non-lazy walk will be considered later. 

3.2 Modified conductance 

While the conductance has proven useful for studying lazy walks, if the chain is not lazy then the
conductance $\Phi(r)$ is not useful for studying mixing. Consider the simple random walk on the complete
bipartite graph $K_{m,m}$, a periodic Markov chain. Every subset $A \subset K_{m,m}$ has many edges to $A^c$ so
conductance is large, but if $A$ is one of the bipartitions then a Markov chain starting in $A$ will bounce
from $A$ to $A^c$ and back again, but it will never mix.

The problem here is that the Markov chain never grows into a larger set, but is always stuck in
half of the space. Therefore, it seems more appropriate to consider how much flow from $A$ reaches a
strictly larger set, that is the worst flow into a set $B$ where $\pi(B) = \pi(A^c)$. In particular, we consider
$\Psi(A) = \Psi(A,\pi(A^c))$ where

$$\Psi(A,t) = \min_{\substack{B\subset V,\ v\in V,\\ \pi(B)\leq t<\pi(B\cup v)}} Q(A,B) + \frac{t - \pi(B)}{\pi(v)}\,Q(A,v) \qquad (5)$$

is the smallest flow from $A$ to a set of size $t$. For a lazy chain the minimum in $\Psi(A)$ occurs at $B = A^c$,
so $\Psi(A) = Q(A,A^c)$. In general, if $\pi$ is uniform then $\Psi(A)$ simplifies to $\Psi(A) = \min_{\pi(B)=\pi(A^c)} Q(A,B)$.
It is now possible to define the set quantity that is the main innovation of this section.

Definition 3.3. The modified conductance $\tilde\phi$ and modified conductance profile $\tilde\phi(r)$ are given by

$$\tilde\phi(A) = \frac{\Psi(A)}{\pi(A)\pi(A^c)} , \qquad \tilde\phi(r) = \min_{\pi(A)\leq r}\tilde\phi(A) , \qquad \tilde\phi = \tilde\phi(1/2) = \min_{A\subset V}\tilde\phi(A) .$$

Define $\phi(A)$ similarly but without $\pi(A^c)$ in the denominator.

For a lazy chain $\Psi(A) = Q(A,A^c)$ and so $\tilde\phi(A) = \tilde\Phi(A)$, and modified conductance extends
conductance to the non-lazy case. The modified conductance captures important properties quite
well. For instance, a connected reversible chain has $\tilde\phi(A) = 0$ if and only if $A$ is one of the bipartitions
of a periodic walk; the minimum in $\Psi(A)$ is then achieved by $B = A$, and $\Psi(A) = Q(A,A) = 0$ rather
than $Q(A,A^c) > 0$ as with conductance.

An alternate interpretation of $\Psi(A)$ is as follows. Given a set $A \subset V$ let $\wp_A \in [0,1]$ satisfy

$$\inf\{y : \pi(A_y) \leq \pi(A)\} \leq \wp_A \leq \sup\{y : \pi(A_y) \geq \pi(A)\} .$$

The set $V \setminus A_{\wp_A}$ contains the vertices with minimum flow from $A$, and so if $u \leq \wp_A$ then $\pi(A_u) - \pi(A) =
\pi(\{y \in V \setminus A_{\wp_A} : Q(A,y) \geq u\,\pi(y)\})$. It follows that

$$\Psi(A) = \int_0^{\wp_A}(\pi(A_u) - \pi(A))\,du = \int_{\wp_A}^1(\pi(A) - \pi(A_u))\,du = \frac{1}{2}\int_0^1 |\pi(A) - \pi(A_u)|\,du , \qquad (6)$$

where the first equality is from the definition of $\Psi(A)$ and the second is from Lemma 2.2. Since
$u$-almost everywhere $(A^c)_u = (A_{1-u})^c$, the final equality shows that $\Psi(A) = \Psi(A^c)$, a property which
is also satisfied by conventional set expansion with $Q(A,A^c) = Q(A^c,A)$.
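The two descriptions of $\Psi(A)$ can be compared on small examples. The sketch below is an illustration only: it assumes (5) is the fractional minimum-flow problem of filling $\pi$-mass $\pi(A^c)$ with the vertices receiving the least flow per unit mass, and that the last expression in (6) is $\frac{1}{2}\int_0^1|\pi(A)-\pi(A_u)|\,du$. The function names and the two chains (a lazy 3-cycle walk and the simple walk on $K_{2,2}$) are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

def measure(pi, S):
    return sum((pi[y] for y in S), F(0))

def flows(pi, P, A):
    """q[y] = Q(A, y) for every state y."""
    return [sum((pi[x] * P[x][y] for x in A), F(0)) for y in range(len(P))]

def psi_greedy(pi, P, A):
    """Psi(A) read off (5): greedily fill mass pi(A^c) with least-flow vertices."""
    q = flows(pi, P, A)
    remaining = 1 - measure(pi, A)
    total = F(0)
    for y in sorted(range(len(P)), key=lambda y: q[y] / pi[y]):
        take = min(pi[y], remaining)  # only a fraction of the last vertex may be used
        total += take / pi[y] * q[y]
        remaining -= take
    return total

def psi_integral(pi, P, A):
    """Psi(A) read off (6): half the integral of |pi(A) - pi(A_u)| over u."""
    q = flows(pi, P, A)
    ratios = [q[y] / pi[y] for y in range(len(P))]
    cuts = sorted(set(ratios) | {F(0), F(1)})
    total = F(0)
    for lo, hi in zip(cuts, cuts[1:]):
        size = measure(pi, [y for y in range(len(P)) if ratios[y] >= hi])
        total += (hi - lo) * abs(measure(pi, A) - size)
    return total / 2

# Hypothetical examples (not from the paper).
lazy = [[F(1, 2), F(1, 4), F(1, 4)],
        [F(1, 4), F(1, 2), F(1, 4)],
        [F(1, 4), F(1, 4), F(1, 2)]]
pi3 = [F(1, 3)] * 3
h = F(1, 2)
k22 = [[0, 0, h, h], [0, 0, h, h], [h, h, 0, 0], [h, h, 0, 0]]  # simple walk on K_{2,2}
pi4 = [F(1, 4)] * 4

def proper_subsets(n):
    return [list(c) for k in range(1, n) for c in combinations(range(n), k)]
```

On the lazy chain both functions return $Q(A,A^c)$, while for a bipartition of $K_{2,2}$ both return 0 even though $Q(A,A^c) = 1/2$, which matches the motivation for the modified conductance given above.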

3.3 An Inequality Prover 

With this formulation of $\Psi(A)$ it is possible to upper and lower bound each $C_f(A)$ via Jensen's
inequality, although the upper bounds require a careful setup. However, an argument based on Lemma
3.4 is more appealing because it gives a general result for all concave $f$, and it immediately implies
sharpness due to the explicit constructions (9) and (10). Moreover, it greatly simplifies proofs of other
isoperimetric bounds (see [11, 10] for other applications).

Lemma 3.4. Given a concave function $f : [0,1] \to \mathbb{R}$ and two non-increasing functions $g, \hat{g} : [0,1] \to
[0,1]$ such that $\int_0^1 g(u)\,du = \int_0^1 \hat{g}(u)\,du$ and $\forall t \in [0,1] : \int_0^t g(u)\,du \geq \int_0^t \hat{g}(u)\,du$, then

$$\int_0^1 f\circ g(u)\,du \leq \int_0^1 f\circ\hat{g}(u)\,du .$$

Proof. The concavity of $f(x)$ implies that

$$\forall x \geq y,\ \delta \geq 0 : \quad f(x) + f(y) \geq f(x+\delta) + f(y-\delta) . \qquad (7)$$

This follows because $y = \lambda\,(y-\delta) + (1-\lambda)\,(x+\delta)$ with $\lambda = 1 - \frac{\delta}{x-y+2\delta} \in [0,1]$ and so by concavity $f(y) \geq
\lambda f(y-\delta) + (1-\lambda)f(x+\delta)$. Likewise, $x = (1-\lambda)\,(y-\delta) + \lambda\,(x+\delta)$ and $f(x) \geq (1-\lambda)f(y-\delta) + \lambda f(x+\delta)$.
Adding these two inequalities gives (7).

The inequality (7) shows that if a bigger value ($x$) is increased by some amount, while a smaller
value ($y$) is decreased by the same amount, then the sum $f(x) + f(y)$ decreases. In our setting, the
condition that $\forall t \in [0,1] : \int_0^t g(u)\,du \geq \int_0^t \hat{g}(u)\,du$ shows that changing from $\hat{g}$ to $g$ increased the
already large values of $\hat{g}(u)$, while the equality $\int_0^1 g(u)\,du = \int_0^1 \hat{g}(u)\,du$ assures that this is canceled
out by an equal decrease in the already small values. The lemma then follows from (7). □

The lemma implies that for any set $A \subset V$, and for some initial conditions, if there are
non-increasing functions $m, M : [0,1] \to [0,1]$ such that

$$\forall t \in [0,1] : \int_0^t M(u)\,du \geq \int_0^t \pi(A_u)\,du \geq \int_0^t m(u)\,du \qquad (8)$$

$$\text{and} \quad \int_0^1 M(u)\,du = \int_0^1 \pi(A_u)\,du = \int_0^1 m(u)\,du$$

then for every concave function $f(x)$ it follows that

$$\frac{\int_0^1 f(M(u))\,du}{f(\pi(A))} \leq C_f(A) \leq \frac{\int_0^1 f(m(u))\,du}{f(\pi(A))} .$$

In the problem at hand, $\pi(A_u) \in [0,1]$ is non-increasing and equation (6) implies $\Psi(A)$ is the area
below $\pi(A_u)$ and above $\pi(A)$, and also above $\pi(A_u)$ and below $\pi(A)$. The extreme cases of $\pi(A_u)$ can
be drawn immediately, as in Figure 1.



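[Figure 1 omitted: two panels plotting $M(u)$ and $m(u)$ against $u \in [0,1]$, each marking the level $\pi(A)$, the point $\wp_A$, and regions of area $\Psi(A)$ above and below $\pi(A)$.]

Figure 1: Distributions such that $\int_0^t M(u)\,du \geq \int_0^t \pi(A_u)\,du \geq \int_0^t m(u)\,du$ given $\Psi(A)$ and $\wp_A$.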



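3.4 Bounds on f-congestion $C_f(A)$

We now show modified conductance bounds on some of the f-congestion quantities of interest.
Theorem 3.5. Given a subset $A \subset V$ then

$$\tilde\phi(A) \geq 1 - C_{\sqrt{a(1-a)}}(A) \geq 1 - \sqrt{1 - \tilde\phi(A)^2} \geq \tilde\phi(A)^2/2$$

$$\tilde\phi(A) \geq 1 - C_{a\log(1/a)}(A) \geq \frac{2\,\phi(A)^2}{\log(1/\pi(A))}$$

$$\tilde\phi(A) \geq 1 - C_{a(1-a)}(A) \geq 4\,\tilde\phi(A)^2\,\pi(A)(1 - \pi(A))$$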

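Proof. For the upper bound, Figure 1 shows that, given $\Psi(A)$ then $\forall t \in [0,1] : \int_0^t M(u)\,du \geq
\int_0^t \pi(A_u)\,du$ and $\int_0^1 M(u)\,du = \pi(A) = \int_0^1 \pi(A_u)\,du$, where

$$M(u) = \begin{cases}
0 & \text{if } u > 1 - \frac{\Psi(A)}{\pi(A)} \\
\pi(A) & \text{if } u \in \left(\frac{\Psi(A)}{\pi(A^c)},\ 1 - \frac{\Psi(A)}{\pi(A)}\right] \\
1 & \text{if } u \leq \frac{\Psi(A)}{\pi(A^c)}
\end{cases} \qquad (9)$$

By Lemma 3.4 any choice of $f(z)$ which is concave and non-negative will therefore satisfy

$$C_f(A) \geq \frac{\int_0^1 f(M(u))\,du}{f(\pi(A))} = \frac{\Psi(A)}{\pi(A)}\frac{f(0)}{f(\pi(A))} + \left(1 - \frac{\Psi(A)}{\pi(A)\pi(A^c)}\right)\frac{f(\pi(A))}{f(\pi(A))} + \frac{\Psi(A)}{1-\pi(A)}\frac{f(1)}{f(\pi(A))} \geq 1 - \tilde\phi(A)$$

This shows all of the upper bounds.

To prove lower bounds, suppose $\wp_A$ and $\Psi(A)$ are known. Then Figure 1 demonstrates that
$\forall t \in [0,1] : \int_0^t \pi(A_u)\,du \geq \int_0^t m(u)\,du$ and $\int_0^1 m(u)\,du = \pi(A) = \int_0^1 \pi(A_u)\,du$, where

$$m(u) = \begin{cases}
\pi(A) - \frac{\Psi(A)}{1-\wp_A} & \text{if } u > \wp_A \\
\pi(A) + \frac{\Psi(A)}{\wp_A} & \text{if } u \leq \wp_A
\end{cases} \qquad (10)$$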
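All that remains is to substitute this into the formula for $1 - C_f(A)$ for the various $f(x)$ of interest,
and then minimize over all possible $\wp_A \in [0,1]$.

The bound on $1 - C_{a(1-a)}$ is the easiest. Apply Lemma 3.4 with $f(a) = a(1-a)$ to obtain

$$C_{a(1-a)}(A) \leq \wp_A\,\frac{\left(\pi(A) + \frac{\Psi(A)}{\wp_A}\right)\left(1 - \pi(A) - \frac{\Psi(A)}{\wp_A}\right)}{\pi(A)\,(1-\pi(A))} + (1-\wp_A)\,\frac{\left(\pi(A) - \frac{\Psi(A)}{1-\wp_A}\right)\left(1 - \pi(A) + \frac{\Psi(A)}{1-\wp_A}\right)}{\pi(A)\,(1-\pi(A))}$$
$$= 1 - \frac{\Psi(A)^2}{\pi(A)\pi(A^c)}\,\frac{1}{\wp_A(1-\wp_A)} \leq 1 - 4\,\tilde\phi(A)^2\,\pi(A)\pi(A^c)$$

For the lower bound on $C_{a\log(1/a)}$ proceed similarly.

$$C_{a\log(1/a)}(A) \leq \wp_A\,\frac{\left(\pi(A) + \frac{\Psi(A)}{\wp_A}\right)\log\frac{1}{\pi(A) + \Psi(A)/\wp_A}}{\pi(A)\log\frac{1}{\pi(A)}} + (1-\wp_A)\,\frac{\left(\pi(A) - \frac{\Psi(A)}{1-\wp_A}\right)\log\frac{1}{\pi(A) - \Psi(A)/(1-\wp_A)}}{\pi(A)\log\frac{1}{\pi(A)}}$$
$$= 1 - \frac{\wp_A + \phi(A)}{\log\frac{1}{\pi(A)}}\log\frac{\wp_A + \phi(A)}{\wp_A} - \frac{1 - \wp_A - \phi(A)}{\log\frac{1}{\pi(A)}}\log\frac{1 - \wp_A - \phi(A)}{1 - \wp_A}$$

Then $(1 - C_{a\log(1/a)}(A))\log(1/\pi(A)) \geq g(\wp_A,\phi(A)) \geq 2\,\phi(A)^2$, where $g(x,y) = (x+y)\log\frac{x+y}{x} + (1-x-y)\log\frac{1-x-y}{1-x}$
for $x \in [0,1]$, $y \in [0,1-x]$ (see Appendix for proof of $g(x,y) \geq 2y^2$).

Now for $C_{\sqrt{a(1-a)}}(A)$. Applying Lemma 3.4 and equation (10) as before,

$$C_{\sqrt{a(1-a)}}(A) \leq \wp_A\sqrt{\left(1 + \frac{\Psi(A)}{\wp_A\,\pi(A)}\right)\left(1 - \frac{\Psi(A)}{\wp_A\,\pi(A^c)}\right)} + (1-\wp_A)\sqrt{\left(1 - \frac{\Psi(A)}{(1-\wp_A)\,\pi(A)}\right)\left(1 + \frac{\Psi(A)}{(1-\wp_A)\,\pi(A^c)}\right)}$$
$$= \sqrt{\left(\wp_A + \tilde\phi(A)\pi(A^c)\right)\left(\wp_A - \tilde\phi(A)\pi(A)\right)} + \sqrt{\left(1 - \wp_A - \tilde\phi(A)\pi(A^c)\right)\left(1 - \wp_A + \tilde\phi(A)\pi(A)\right)}$$

It is shown in the Appendix that $\sqrt{XY} + \sqrt{(1-X)(1-Y)} \leq \sqrt{1 - (X-Y)^2}$ when $X, Y \in [0,1]$. Let
$X = \wp_A + \tilde\phi(A)\pi(A^c)$ and $Y = \wp_A - \tilde\phi(A)\pi(A)$. The bound on $C_{\sqrt{a(1-a)}}(A)$ then follows immediately. □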



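It follows, for instance, that

$$\tau_2(\epsilon) \leq \left\lceil \frac{2}{\tilde\phi^2}\log\frac{1}{\epsilon\sqrt{\pi_*}} \right\rceil \quad \text{and} \quad \tau_2(\epsilon) \leq \left\lceil \int_{4\pi_*}^{4/\epsilon^2} \frac{2\,dr}{r\,\tilde\phi(r)^2} \right\rceil . \qquad (11)$$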



Conductance can be used to obtain a crude lower bound on the modified conductance.
Lemma 3.6. For an ergodic Markov chain, if $\forall x \in V : P(x,x) \geq \gamma \in [0,1]$ and $A \subset V$ then

$$\tilde\Phi(A) \geq \tilde\phi(A) \geq \min\left\{1,\ \frac{\gamma}{1-\gamma}\right\}\tilde\Phi(A) .$$

Proof. The upper bound $\tilde\phi(A) \leq \tilde\Phi(A)$ is trivial because $\Psi(A) \leq Q(A,A^c)$. The minimum in the lower
bound is equal to 1 exactly when $\gamma \geq 1/2$, but in this case $\tilde\phi(A) = \tilde\Phi(A)$, so this case is also trivial. It
remains to consider the lower bound when $\gamma < 1/2$.

In the definition of $\Psi(A)$ there is a set $B$, and one vertex $v$ for which only a $\frac{\pi(A^c) - \pi(B)}{\pi(v)}$ fraction
is counted. Extend the state space $V$ to a space $V'$ by splitting $v$ into two vertices $v_1$ and $v_2$, with
$v_1$ of size $\pi(A^c) - \pi(B)$, $v_2$ with the remainder, and ergodic flows into $v_1$ and $v_2$ determined by their
respective sizes. Then let $C = B \cup v_1$ be the set where $\Psi(A) = Q(A,C)$. It follows that

$$\Psi(A) = Q(A, C\cap A^c) + Q(A, C\cap A)$$
$$\geq Q(A, C\cap A^c) + \gamma\,\pi(C\cap A)$$
$$\geq Q(A, C\cap A^c) + \gamma\,\frac{Q(A, A^c\setminus C)}{1-\gamma}$$
$$\geq \frac{\gamma}{1-\gamma}\,Q(A,A^c)$$

The first inequality uses the fact that $\forall v \in A : Q(v,v) \geq \gamma\,\pi(v)$ and so $Q(A,v) \geq \gamma\,\pi(v)$. The second
inequality is because $\pi(C\cap A) = \pi(A^c\setminus C) \geq \frac{Q(A, A^c\setminus C)}{1-\gamma}$. □

The $\gamma/(1-\gamma)$ factor is introduced when converting $C \cap A$ into a subset of $A^c$, in short primarily
because $Q(A,A^c)$ is not the correct quantity to work with for non-lazy chains. This induces a mixing
bound in terms of conductance for non-lazy walks, but this will be substantially improved on later.



4 A comparison to previous isoperimetric bounds 

How do our new results compare to previous isoperimetric bounds? In this section we compare our 
new Evolving set mixing bounds to earlier Evolving Set bounds, to Spectral profile bounds, and to 
Blocking Conductance results. 



4.1 Evolving Sets 

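Morris and Peres used a more probabilistic argument than ours to show that if $x \in V$ and $S_0 = \{x\}$
then

$$\left\|\frac{P^n(x,\cdot)}{\pi} - 1\right\|_{2,\pi} \leq \hat{E}_n\,\frac{\min\left\{\sqrt{\pi(S_n)},\ \sqrt{1 - \pi(S_n)}\right\}}{\pi(S_n)} ,$$

not a major difference but up to $\sqrt{2}$ times weaker than our bound in Theorem 2.6. They did not have
bounds on total variation or relative entropy.

Our rate of contraction $C_{\sqrt{a(1-a)}}$ on $L^2$ distance is also better than the $C_{\sqrt{a}}$ that they showed.

Let $f(x,y) = \sqrt{\frac{x}{y}} - \sqrt{\frac{x(1-x)}{y(1-y)}}$ with domain $x,y \in (0,1)$. This is convex in $x$ because

$$\frac{d^2}{dx^2}f(x,y) = -\frac{1}{4x^{3/2}\sqrt{y}} + \frac{1}{4\,(x(1-x))^{3/2}\sqrt{y(1-y)}} \geq 0 .$$

Then by Jensen's inequality,

$$C_{\sqrt{a}}(A) - C_{\sqrt{a(1-a)}}(A) = \int_0^1 f(\pi(A_u),\pi(A))\,du \geq f\left(\int_0^1 \pi(A_u)\,du,\ \pi(A)\right) = f(\pi(A),\pi(A)) = 0 ,$$

showing that $C_{\sqrt{a(1-a)}}(A) \leq C_{\sqrt{a}}(A)$.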
4.2 Spectral Profile

Two isoperimetric bounds on mixing time are shown in the Spectral Profile paper [4]: 



r2{e) < 



4A^ 4dr 



47r. r$pp*(r) 



and T2(e) < 



2dr 



(12) 



The holding probability $\gamma \in [0,1]$ is such that $\forall x : P(x,x) \geq \gamma$.

Our commentary will show the Evolving set bounds are at least as good as the Spectral profile
$L^2$ bounds above. However, keep in mind that the Evolving set bounds apply to other distances,
such as total variation and relative entropy, for which Evolving sets give the only known conductance
bounds on mixing which are not simply induced from an $L^2$ mixing bound. See [12] for an example where
modified conductance is used to show a total variation mixing bound which is strictly better than the
$L^2$ mixing bound.

First, we show that bounding mixing time with modified conductance is no worse than using the
multiplicative reversibilization $PP^*$ in (12). However, the real lesson of this result is that modified
conductance bounds can give a substantial improvement. In particular, in the proof of Lemma 4.1
we give an explicit construction in which $\tilde\phi(A) = \sqrt{\tilde\Phi_{PP^*}(A)}$, and so in the worst case scenario
$\tilde\Phi_{PP^*}(r)^2 = \tilde\phi(r)^4$, and the first bound of (12) may be nearly as bad as the square of the modified
conductance mixing bound!

Lemma 4.1.

$$\sqrt{\tilde\Phi_{PP^*}(A)} \geq \tilde\phi(A) \geq 1 - \sqrt{1 - \tilde\Phi_{PP^*}(A)} \geq \frac{1}{2}\,\tilde\Phi_{PP^*}(A) .$$

Proof. To simplify notation, in the definition of $\Psi(A)$ assume that the set $B$ satisfies $\pi(B) = \pi(A^c)$,
i.e. $\Psi(A) = Q(A,B)$. The general case is similar.
To begin with, we need a few identities:

$$Q(A,B) = \pi(A) - Q(A,B^c) = \pi(A) - \left(\pi(B^c) - Q(A^c,B^c)\right) = Q(A^c,B^c)$$

$$Q_{PP^*}(A,A^c) = \sum_{y\in V} Q(A,y)\,P^*(y,A^c) = \sum_{y\in V} \frac{Q(A,y)}{\pi(y)}\left(1 - \frac{Q(A,y)}{\pi(y)}\right)\pi(y)$$

We split this final summation into a sum over vertices in $B$ and a sum over those in $B^c$, and then
upper and lower bound these using Lemma 3.4. First, consider the sum over $B$. To apply the lemma
we need to replace the summation by an integral. Sort the vertices in $B$ as $\{y_i\}$, so that
$Q(A,y_i)/\pi(y_i)$ is a decreasing function of $i$. Define $g(u) = Q(A,y_k)/\pi(y_k)$ where $k$ is chosen so that
$\sum_{i<k} \pi(y_i)/\pi(B) \le u < \sum_{i\le k} \pi(y_i)/\pi(B)$,
and let $g(1) = 0$. Then $g(u)$ is a decreasing function, so Lemma 3.4 can be used to upper and lower
bound $\int_0^1 f(g(u))\,du$ where $f(x) = x(1-x)$ is concave.

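Two conditions will be used in our application of Lemma 3.4. First,

$$\int_0^1 g(u)\,du = \sum_{y\in B} \frac{Q(A,y)}{\pi(y)}\,\frac{\pi(y)}{\pi(B)} = \frac{Q(A,B)}{\pi(B)} = \frac{\Psi(A)}{\pi(A^c)} .$$

Second, set $B$ contains the vertices where $Q(A,y)/\pi(y)$ is smallest, that is,
$\forall y \in B^c,\ v \in B : \frac{Q(A,y)}{\pi(y)} \ge \frac{Q(A,v)}{\pi(v)}$.
Then $\forall y \in B^c : \frac{Q(A,y)}{\pi(y)} \ge \max_u g(u) = g(0)$ and so $Q(A,B^c) \ge \pi(B^c)\,g(0) = \pi(A)\,g(0)$, hence

$$g(0) \le \frac{Q(A,B^c)}{\pi(A)} = \frac{\pi(A) - Q(A,B)}{\pi(A)} = 1 - \frac{\Psi(A)}{\pi(A)} .$$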
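From these two conditions it follows that $\forall t \in [0,1] : \int_0^t M_2(u)\,du \ge \int_0^t g(u)\,du \ge \int_0^t m_2(u)\,du$
when

$$m_2(u) = \frac{\Psi(A)}{\pi(A^c)} \quad \text{for all } u \in [0,1], \qquad
M_2(u) = \begin{cases} 1 - \frac{\Psi(A)}{\pi(A)} & \text{if } u \le \frac{\Psi(A)/\pi(A^c)}{1 - \Psi(A)/\pi(A)} \\[4pt] 0 & \text{if } u > \frac{\Psi(A)/\pi(A^c)}{1 - \Psi(A)/\pi(A)} \end{cases}$$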



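By Lemma 3.4

$$\sum_{y\in B} \frac{Q(A,y)}{\pi(y)}\left(1 - \frac{Q(A,y)}{\pi(y)}\right)\frac{\pi(y)}{\pi(B)} = \int_0^1 f(g(u))\,du \ge \int_0^1 f(M_2(u))\,du = \frac{\Psi(A)^2}{\pi(A)\pi(A^c)} .$$

Likewise,

$$\sum_{y\in B} \frac{Q(A,y)}{\pi(y)}\left(1 - \frac{Q(A,y)}{\pi(y)}\right)\frac{\pi(y)}{\pi(B)} \le \int_0^1 f(m_2(u))\,du = \frac{\Psi(A)}{\pi(A^c)}\left(1 - \frac{\Psi(A)}{\pi(A^c)}\right) .$$

It follows that

$$\Psi(A)\left(1 - \frac{\Psi(A)}{\pi(A^c)}\right) \ge \sum_{y\in B} Q(A,y)\left(1 - \frac{Q(A,y)}{\pi(y)}\right) \ge \frac{\Psi(A)^2}{\pi(A)} .$$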



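To show similar bounds for the sum over $B^c$ first re-arrange the terms as

$$\sum_{y\in B^c} \frac{Q(A,y)}{\pi(y)}\left(1 - \frac{Q(A,y)}{\pi(y)}\right)\frac{\pi(y)}{\pi(B^c)} = \sum_{y\in B^c} \frac{Q(A^c,y)}{\pi(y)}\left(1 - \frac{Q(A^c,y)}{\pi(y)}\right)\frac{\pi(y)}{\pi(B^c)} .$$

A similar argument applies again (recall $Q(A^c,B^c) = \Psi(A)$) showing that

$$\Psi(A)\left(1 - \frac{\Psi(A)}{\pi(A)}\right) \ge \sum_{y\in B^c} Q(A,y)\left(1 - \frac{Q(A,y)}{\pi(y)}\right) \ge \frac{\Psi(A)^2}{\pi(A^c)} .$$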



Combining the two relations shows that

$$\Psi(A)\left(2 - \tilde\phi(A)\right) \ge Q_{PP^*}(A,A^c) \ge \frac{\Psi(A)^2}{\pi(A)\pi(A^c)} .$$

Dividing through by $\pi(A)\pi(A^c)$ and then re-arranging the inequalities completes the proof.



□ 
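The chain of inequalities in Lemma 4.1 can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the chain `P` is a hypothetical doubly stochastic (non-reversible) walk on 5 states, so that $\pi$ is uniform; $\Psi(A)$ is computed greedily by filling mass $\pi(A^c)$ with the vertices of smallest $Q(A,y)/\pi(y)$, taking the last vertex fractionally; and $Q_{PP^*}(A,A^c)$ is evaluated through the identity used in the proof. The function names are not from the paper.

```python
import math
from itertools import combinations

def psi(ratio, pi, mass):
    """Greedy Psi(A): cover `mass` of stationary measure using the vertices with
    the smallest Q(A,y)/pi(y), taking the last vertex fractionally."""
    total, left = 0.0, mass
    for y in sorted(range(len(pi)), key=lambda v: ratio[v]):
        take = min(pi[y], left)
        total += take * ratio[y]
        left -= take
        if left <= 0.0:
            break
    return total

def conductances(P, pi):
    """Return (A, modified conductance of P, conductance of PP*) for each proper subset A."""
    n = len(pi)
    rows = []
    for k in range(1, n):
        for A in combinations(range(n), k):
            pA = sum(pi[x] for x in A)
            pAc = 1.0 - pA
            # ratio[y] = Q(A,y)/pi(y)
            ratio = [sum(pi[x] * P[x][y] for x in A) / pi[y] for y in range(n)]
            phi_mod = psi(ratio, pi, pAc) / (pA * pAc)
            # Q_{PP*}(A,A^c) = sum_y pi(y) r(y) (1 - r(y)), as in the proof
            q_pp = sum(pi[y] * r * (1.0 - r) for y, r in enumerate(ratio))
            rows.append((A, phi_mod, q_pp / (pA * pAc)))
    return rows

def lemma_4_1_holds(phi_mod, phi_pp, tol=1e-12):
    """Check sqrt(phi_pp) >= phi_mod >= 1 - sqrt(1 - phi_pp) >= phi_pp / 2."""
    upper = math.sqrt(phi_pp)
    lower = 1.0 - math.sqrt(max(0.0, 1.0 - phi_pp))
    return upper + tol >= phi_mod >= lower - tol and lower + tol >= 0.5 * phi_pp

# Hypothetical test chain: hold w.p. 0.2, step +1 w.p. 0.5, step +2 w.p. 0.3 (mod 5).
n = 5
P = [[0.0] * n for _ in range(n)]
for x in range(n):
    P[x][x] += 0.2
    P[x][(x + 1) % n] += 0.5
    P[x][(x + 2) % n] += 0.3
pi = [1.0 / n] * n
rows = conductances(P, pi)
```

Calling `lemma_4_1_holds(phi_mod, phi_pp)` on each row reports whether the three inequalities hold for that set, up to floating point tolerance.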



The lemma induces mixing bounds in terms of $\Phi_{PP^*}(r)$ for total variation, relative entropy and
$L^2$ distance. For instance,



T2(e) < 



J 47rt 



4/^' 8dr 



ripp*(r)2 



and T2{e) < 



$2 



log 



1 



(13) 
This is not directly comparable to the Spectral profile bound, but it is never more than a factor two
worse, and is strictly better when x4>pp* ^j:^) is convex as is often the case. 

In a survey with Tetali [14] we use a more specialized method, applicable only to $1 - C_{\sqrt{a(1-a)}}$,
to show that

This gives exactly the same mixing bound as the Spectral Profile result, and can be improved by a 
factor two when x^pp, ^jq^^ is convex. 

Now, consider the second bound of (12), with a holding probability. It is possible to use modified
conductance via Theorem 3.5, combined with Lemma 3.6 to bound f-congestion in terms of conductance
for non-lazy walks. However, this is far from the Spectral profile bound. We now give a more
direct argument improving substantially on this.

Lemma 4.2. Consider a Markov chain with $\forall x \in V : P(x,x) \ge \gamma \in [0,1]$. If $A \subset V$ then



(1-7)2 log(l/7r(A)) 

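Proof. The upper bounds follow trivially from Theorem 3.5 and the relation $\Psi(A) \le Q(A,A^c)$.
Next, suppose that $\gamma \ge 1/2$.

Let $P' = \frac{1}{2(1-\gamma)}\,P + \left(1 - \frac{1}{2(1-\gamma)}\right) I$, a sped up chain with holding probability 1/2, and denote
its Evolving sets by $A'_u$ and f-congestion by $C'_f(A)$. Then,

$$A_u = \begin{cases} A'_{u/2(1-\gamma)} & \text{if } u \le 1-\gamma \\ A & \text{if } u \in [1-\gamma, \gamma] \\ A'_{1-(1-u)/2(1-\gamma)} & \text{if } u > \gamma \end{cases}$$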



It follows that

$$1 - C_f(A) = 1 - \int_0^1 \frac{f(\pi(A_u))}{f(\pi(A))}\,du = 2(1-\gamma)\left(1 - C'_f(A)\right) . \qquad (14)$$
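The rescaling identity (14) is easy to confirm numerically. The following is a minimal sketch, assuming a hypothetical doubly stochastic chain on 5 states with holding probability at least $\gamma = 0.7$; the helper `congestion` (a name introduced here, not in the paper) evaluates $C_f(A) = \int_0^1 f(\pi(A_u))\,du / f(\pi(A))$ exactly, using the fact that $\pi(A_u)$ is a step function of $u$ with $A_u = \{y : Q(A,y) \ge u\,\pi(y)\}$.

```python
import math

def congestion(P, pi, A, f):
    """C_f(A) = int_0^1 f(pi(A_u)) du / f(pi(A)), with A_u = {y : Q(A,y) >= u pi(y)}."""
    n = len(pi)
    ratio = [min(1.0, sum(pi[x] * P[x][y] for x in A) / pi[y]) for y in range(n)]
    # pi(A_u) only changes at the values Q(A,y)/pi(y), so integrate piece by piece.
    cuts = sorted(set([0.0, 1.0] + ratio))
    total = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        u = 0.5 * (lo + hi)
        mass = sum(pi[y] for y in range(n) if ratio[y] >= u)
        total += (hi - lo) * f(mass)
    return total / f(sum(pi[x] for x in A))

def f_sqrt(a):
    return math.sqrt(max(0.0, a * (1.0 - a)))

def f_quad(a):
    return a * (1.0 - a)

# Hypothetical chain: P = gamma*I + (1-gamma)*R with R doubly stochastic, so P(x,x) >= gamma.
n, gamma = 5, 0.7
R = [[0.0] * n for _ in range(n)]
for x in range(n):
    R[x][x] += 0.2
    R[x][(x + 1) % n] += 0.5
    R[x][(x + 2) % n] += 0.3
P = [[gamma * (x == y) + (1 - gamma) * R[x][y] for y in range(n)] for x in range(n)]
c = 1.0 / (2.0 * (1.0 - gamma))
P_fast = [[c * P[x][y] + (1.0 - c) * (x == y) for y in range(n)] for x in range(n)]
pi = [1.0 / n] * n

def both_sides(A, f):
    """Left and right side of equation (14) for the set A and function f."""
    lhs = 1.0 - congestion(P, pi, A, f)
    rhs = 2.0 * (1.0 - gamma) * (1.0 - congestion(P_fast, pi, A, f))
    return lhs, rhs
```

For example, `both_sides((0, 1), f_sqrt)` returns two numbers that should agree up to rounding, since (14) holds for every $f$.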

The ergodic flow of $P'$ is $\Psi_{P'}(A) = Q_{P'}(A,A^c) = \frac{Q(A,A^c)}{2(1-\gamma)}$ and so Theorem 3.5 then induces lower
bounds on $1 - C'_f(A)$. For instance,

/ 



V 



\\ \2{l-j)j -4(1-7) 



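Now, suppose that $\gamma < 1/2$.

Observe that $\int_0^1 \pi(A_u \cap A^c)\,du = Q(A,A^c)$, $\pi(A_u \cap A^c)$ is a decreasing function, and $\pi(A_u \cap A^c) = 0$
when $u > 1-\gamma$, and so $\int_0^\gamma \pi(A_u \cap A^c)\,du \ge \frac{\gamma}{1-\gamma}\int_0^{1-\gamma} \pi(A_u \cap A^c)\,du = \frac{\gamma}{1-\gamma}\,Q(A,A^c)$. Also, $A_u \cap A = A$ if $u \le \gamma$,
and so $\int_0^\gamma \pi(A_u)\,du \ge \gamma\,\pi(A) + \frac{\gamma}{1-\gamma}\,Q(A,A^c)$. A similar argument shows that $\int_{1-\gamma}^1 \pi(A_u)\,du \le \gamma\,\pi(A) - \frac{\gamma}{1-\gamma}\,Q(A,A^c)$.
Combining these two cases, it follows that $\forall t \in [0,1] : \int_0^t \pi(A_u)\,du \ge \int_0^t m_1(u)\,du$ where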



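$$m_1(u) = \begin{cases} \pi(A) + \frac{Q(A,A^c)}{1-\gamma} & \text{if } u < \gamma \\ \pi(A) & \text{if } u \in [\gamma, 1-\gamma] \\ \pi(A) - \frac{Q(A,A^c)}{1-\gamma} & \text{if } u > 1-\gamma \end{cases}$$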


By Lemma 3.4 if / is concave then 



7 



However, the term in parenthesis is just 1 — f{m{u))du (recall m{u) from equation (10)) when 
2^{^A) = ^^j^"^ ^ and pA = 1/2- Hence, this is lower bounded by the bound on 1 — Cf{A) given in 



Theorem 3.5 when ^{A) = ^^^z^- The lemma follows, for instance by 

/ 



1 - 



V 



. / HA) 



> 



7 



4(1-7)' 



^Af . 



(16) 



□ 



The lemma induces mixing bounds in terms of $\Phi(r)$ for total variation, relative entropy and $L^2$
distance. For instance,



^2(e) < 



■^Z'" 4max{7,l -7} 
47r. TTrr$(r)2 



dr 



and T2(e) < 



4max{7, 1 — 7} . 1 
log 



J_ $2 

1-7 ^ 



(17) 



This is not directly comparable to the Spectral profile bound, but it is never more than a factor two 
worse, and is strictly better when the walk is lazy (i.e. 7 = 1/2) or x^^ i^j^^ is convex. 

4.3 Blocking Conductance 

As discussed in the introduction, our methods give new insight into the mixing time bounds of Blocking
conductance [6]. To state the Blocking conductance theorem, recall the definition of $\Psi(A,t)$ from
equation (5), however when $t > 1 - \pi(A)$ instead define $\Psi(A,t) = \Psi(A^c, 1-t)$.

Theorem 4.3. [Blocking Conductance [6]] Given a lazy, reversible, ergodic Markov chain then 



Trv{e) < 15000 
where h{x) can be any of the following: 

1 

1. Vx G [0,1] : hgi{x) > sup 



1/2 



h{x)dx + h{l/2) log2(l/2e) 



) 



Acv, ^iA)^gi{A) 

■k{A)<x 



where ipgi 



dt 
1 ^(A'^ t) 

2. Vx G [0,1] : hmod{x) > sup — ■ — - where ^pmod{A) = / , dt 



AcV 
it{A)<x 



1 f^y^) \if(A'^ t) 

5. Vx G [0,1] : > sup — — — - where V M) = / \ 'J dt 

A<zl, xil;+{A) ^ ^ ^ <A? 

x/2<-k{A)<x 

The state space $\mathcal{V} = [0,1]$ is the continuization, and is defined by associating to each $v \in V$ a
disjoint interval of size $\pi(v)$, with ergodic flow such that if $dx \subset v_1$ and $dy \subset v_2$ then
$Q(dx,dy) = \frac{dx\,P(v_1,v_2)\,dy}{\pi(v_2)}$.

The large coefficient is due to a conversion from one measure of mixing time to another, and the 
need for the continuization is because the theorem is proven in the continuous space setting. 

To relate this to our Evolving set bounds we first rewrite f-congestion quantities in terms of the
$\psi(A)$ quantities appearing in the Blocking Conductance theorem.

Lemma 4.4. Let ^K^) = So ^(^4^ dt and 4^+{A) = ^^(^^ dt. Then, 

l-Caii-a){A) = 27r{A)TriA'')4>giiA) 



^ - Ca\og{l/a){A) = 



Proof. We work out the 1 — Ca\og(i/a){A) case in detail. 
First, rewrite things a bit. 

, . _ AA) log(l/7r(A)) - j; TTjA^) log(l/7r(A^)) du 

t^alog(l/a)l^J - vr(^)log(l/7r(^)) 

dtdu 



The second equality applied the identity $\int_0^1 \pi(A_u)\,du = \pi(A)$.

Now to rewrite in terms of $\Psi(A^c,t)$. Let $w(t) = \max\{y : \pi(A_y) \ge t\}$. Then

Jo Jn{Au) *7r(^) Jo J^^t) tTT{A) Jn{A)Jo tTT{A) 

^{A'',t) 



MA) 



dt . (18) 



The first equality was a change in order of integration, while the second applies Lemma 5.7 from the 
Appendix. 

The 1 — Ca(i-a)iA) result is shown similarly. For the 1 — C ^ a{i-a) ^^^ case, apply the inequality, 
for X, y G [0, 1] 
and substitute $y = \pi(A_u)$ and $x = \pi(A)$. To simplify, use the same method as in equation (18), to
show that r{A) = i KTIfJii^f du. □ 

When combined with Lemma 2.9 it follows, for instance, that

$$E_{n+1} \log\frac{1}{\pi(S_{n+1})} - E_n \log\frac{1}{\pi(S_n)} = -E_n\,\psi_{mod}(S_n) .$$



The expectation of $\psi_{mod}(S_n)$ is exactly the rate at which the evolving set bound on relative entropy
decreases. Hence, in a sense Blocking Conductance and Evolving Set bounds are both based on
measuring the derivative of the distance with respect to time. Not surprisingly, the Evolving set
mixing bounds imply bounds of the Blocking Conductance form.



^hgl{l/2)lOg—^ 



Corollary 4.5. Consider a finite (non-lazy, non-reversible) ergodic Markov chain. Then 

2C 



T2(e) < 



1/2 2 

hmod{x) dx + C hraodO-/"^) log " 



4 / /i+(a;)dx + /i+(l/2)log' 

J 47r* 



where 



7r(A)<a; 

and C is the optimal constant satisfying 



max — ■ — --, 

ACV, Xljjmod{A) 
Tr{A)<x 



h+{. 



max 



Acv, xi/j+{A) 

it{A)<x 



Vr > TT* 



mm — - — 

7r{A)<r log(l/7r(74)) 



. Ipmodi^) 
mm 

n{A)<r log(l/r) 



2 It mm^(^)<^ iog(i/7r(A)) 



min^(A)e[r/2,rl lotaMA)) • '^^^ Corollary then 



It suffices to take C 

shows that as long as the bottlenecks get sufficiently worse as set size increases, then our new Evolving 
set bounds sharply improve on Blocking conductance results. The laziness and reversibility require- 
ments are dropped, the bounds are given in terms of stronger measures of distance, and there is no 
need to work in a continuous state space. The bottleneck condition holds for most problems, but for 
instance it does not apply to certain walks used for estimating volume of convex bodies, or to Example 
4.8 below. 

The interested reader can use the quantities calculated in Example 5.3 to find that Corollary 4.5 
is within a factor 4 of being sharp for the walk on a complete graph. A "convex" version, based on 
Theorem 2.11, can be used to strengthen this to a factor 2. 

Proof. For the total variation and $L^2$ bounds apply Corollary 2.10 and Theorem 2.11 respectively
to obtain mixing time bounds in terms of various $1 - C_f(A)$. Replacing the f-congestion by the
appropriate $\psi(A)$ quantities from Lemma 4.4 then gives the results. However, the relative entropy
case requires more work. This is because $\forall r \ge 1/2$ both $1 - C_{a(1-a)}(r) = 1 - C_{a(1-a)}(1/2)$ and
$1 - C_{\sqrt{a(1-a)}}(r) = 1 - C_{\sqrt{a(1-a)}}(1/2)$, while $1 - C_{a\log(1/a)}(r) \ne 1 - C_{a\log(1/a)}(1/2)$ when $r > 1/2$.
From Theorem 2.6, it follows that if $g(a) = \min\left\{1 + \log\frac{1}{2a},\ \frac{1-a}{a}\left(1 + \log\frac{1}{2(1-a)}\right)\right\}$ then

$$D(P^n(x,\cdot)\,\|\,\pi) \le E_n \log\frac{1}{\pi(S_n)} \le E_n\,g(\pi(S_n)) .$$

By Theorem 2.11, and the relation $C_{ag(a)}(r) = C_{ag(a)}(1/2)$ for $r \ge 1/2$ (since $a\,g(a) = (1-a)\,g(1-a)$),
the mixing time is then bounded by



roie) < 



2dr ^ 21og. 



r(l + log(l/2r))(l - Cagia){r)) 1 - C„,(„)(l/2) 



Consider set $A \subset V$ with $\pi(A) \le r \le 1/2$. Then $\pi(A)\,g(\pi(A)) = \pi(A)\left(1 + \log\frac{1}{2\pi(A)}\right)$ and $a\,g(a) \le
a\left(1 + \log\frac{1}{2a}\right)$ $\forall a \in [0,1]$, and so

1 - Cag{a) (^) > 1 - Ca(l+log(l/2a) (^) 

_ log(l/7r(^)) 

" l + log(l/27r(^)) ^ ^alog(l/a)i^Jj 

^ log(l/r) V'modl?^) 

- l + log(l/2r) log(l/r) ■ 
Substituting this into the bound on r£)(e) given above completes the proof. □ 
Remark 4.6. A straightforward generalization of work in [13] can be used to show that 

MA) > l^modiA) > 1 - C^{A) > i^+(A) > i^(A)2 . 

Hence, these various V-'(^) quantities are closely related to each other, and to modified conductance. 
Remark 4.7. For a lazy walk a useful interpretation oftp~^{A) is given in [6]: 

V^+(A)> sup min ^^^^A5^'^ >-<^\A). 
7r(5)<A 

When combined with Lemma 4-4 it follows that 

1 - C rr, — ^{A) > - sup mm , ' > -^^(A) . 

V«(i-«)^ ^ - 4 x<n(A) SCA, 7r(yl)27r(^^)2 "8 ^ ^ 

7r(5)<A 

This can be interpreted as follows. Let $\lambda$ denote the maximal size of a "blocking set", such that if
any set $S$ smaller than this is blocked from transitioning then it does not block too much of the ergodic
flow $Q(A,A^c)$. For instance, $Q(A \setminus S, A^c) = Q(A,A^c) - Q(S,A^c) \ge Q(A,A^c) - \lambda/2$, and so by setting
$\lambda = Q(A,A^c)$ then the first lower bound on $\psi^+(A)$ implies the second.

Example 4.8. When there is a bottleneck at a small set then the $L^2$ mixing time can in fact be slower
than total variation mixing time. In this case the difference between Theorem 4.3 and Corollary 4.5
may be real, and not simply an artifact of the method of proof. As an example of this, let us consider
a complete graph $K_m$ on $m$ vertices, and attach an additional vertex $v$ by a single edge. We study
the lazy max-degree walk given by choosing a neighboring vertex with probability $1/2m$ each, and
otherwise do nothing.

First, bound If ^ = {v} then let A = vr({D}) = The only set 7r{S) < A is 5 = 0, and 

so ^P+{{v}) > ^QJgj^-^ =^.UA^ {v} then <^{A) > 1/8, and so ^P+{A) > ^. 

To bound mixing via Blocking Conductance, note that /i+(r) < ^ if r < 2(m+i) ' while h'^(r) < ^ 
otherwise. Then, by Theorem 4.3, 

$$\tau_{TV}(\epsilon) = O(m \log(1/\epsilon)) ,$$

which is of the correct order. 

For Evolving Sets, we can only say that h'^{r) < ^ for all r. Then, by Corollary 4.5, 

$$\tau_2(\epsilon) = O(m \log(m/\epsilon))$$

which is again of the correct order. 

5 Examples 

The purpose of this section is to demonstrate sharpness of our bounds. We start with the elementary 
example of a walk on a complete graph, in which each bound is either sharp or at least asymptotically 
of the correct order. This is followed by a careful analysis of random walk on a cycle, in which we 
show fairly sharp total variation mixing time bounds. We finish by discussing the simple random walk 
on a directed Eulerian graph, for which our methods appear to give the first proof of a mixing time 
bound. 

First, we see that the conductance bounds are sharp. 

Example 5.1. Consider the uniform two-point space $\{0,1\}$ with transition kernel $P(0,0) = P(1,1) =
\gamma \in [0,1]$ and $P(0,1) = P(1,0) = 1-\gamma$. Then $\phi(A) = 2(1-\gamma)$, and so by equations 15 and 16,

$$2(1-\gamma) \ge 1 - C_{\sqrt{a(1-a)}} \ge 2\min\{\gamma, 1-\gamma\} .$$
Hence $1 - C_{\sqrt{a(1-a)}} = 2(1-\gamma)$ if $\gamma \ge 1/2$.

More generally, 1 — C^y^^^jZa) ^0 = 2 min{7, 1 — 7} and so the upper and lower bound are equal 
and 1 - = 2min{7, 1 - 7} for all 7 G [0, 1]. 

Theorem 3.5 can lead to sharp bounds, even for holding probability under 1/2. 

Example 5.2. Consider the random walk on the complete graph $K_m$ with $P(x,y) = 1/m$. Then
$\forall A \subset V : \tilde\phi(A) = 1$ and so $1 \ge 1 - C_{\sqrt{a(1-a)}}(A) \ge 1 - \sqrt{1 - 1^2} = 1$. Moreover, when $\pi(A) = 1/2$ then
$1 \ge 1 - C_{a(1-a)}(A) \ge 1$ and $1 \ge 1 - C_{a\log(1/a)}(A) \ge (2\log 2)^{-1} \approx 0.72$. Therefore at least two of the
three bounds in Theorem 3.5 can be sharp.

Rescaling can be used to extend this to sharp bounds for other holding probabilities. The argument
used to show equation (14) applies to any value of $\gamma \in [0,1]$, as long as $\Psi(A) = Q(A,A^c)$. In
particular, if $\gamma \ge 1/m$ then the walk on $K_m$ with $P(x,x) = \gamma$ and $P(x,y) = \frac{1-\gamma}{m-1}$ $\forall y \ne x$ satisfies
$\Psi(A) = \pi(A)\pi(A^c)\frac{m}{m-1}(1-\gamma) = Q(A,A^c)$. Hence, by equation (14), if $\gamma = 1/m$, and $P'$ is the walk
with holding probability 1/2, then $1 - C'_{\sqrt{a(1-a)}}(A) = \frac{1 - C_{\sqrt{a(1-a)}}(A)}{2(1-1/m)} = \frac{1}{2(1-1/m)}$. More generally, if
$\gamma \ge 1/m$ then

$$1 - C_{\sqrt{a(1-a)}}(A) = 2(1-\gamma)\left(1 - C'_{\sqrt{a(1-a)}}(A)\right) = \frac{m}{m-1}\,(1-\gamma) .$$
In fact, the f-congestion can be used to show sharp mixing time bounds, regardless of holding
probability.

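Example 5.3. Given $\alpha \in \left[-\frac{1}{m-1}, 1\right]$ consider the walk on $K_m$ with $P(x,y) = (1-\alpha)/m$ for all $y \ne x$
and $P(x,x) = \alpha + (1-\alpha)/m$, that is, choose a point uniformly at random and move there with
probability $1-\alpha$, otherwise do nothing.

The $n$ step distribution is $P^n(x,x) = \frac{1}{m} + \alpha^n\left(1 - \frac{1}{m}\right)$ and $P^n(x,y) = \frac{1}{m} - \frac{\alpha^n}{m}$ for all $y \ne x$.
Therefore, when $\alpha \in [0,1]$ then $D(P^n(x,\cdot)\,\|\,\pi) = (1 + o_m(1))\,\alpha^n \log m$ as $m \to \infty$. When
$\alpha \in \left[-\frac{1}{m-1}, 1\right]$ then $\|P^n(x,\cdot) - \pi\|_{TV} = |\alpha|^n (1 - 1/m)$ and $\|P^n(x,\cdot) - \pi\|_{L^2(\pi)} = |\alpha|^n \sqrt{m-1}$.

Now for evolving sets. If $\alpha \in [0,1]$ then

$$\pi(A_u) = \begin{cases} 0 & \text{if } u \in (\alpha + (1-\alpha)\pi(A),\ 1] \\ \pi(A) & \text{if } u \in ((1-\alpha)\pi(A),\ \alpha + (1-\alpha)\pi(A)] \\ 1 & \text{if } u \in [0,\ (1-\alpha)\pi(A)] \end{cases}$$

A quick calculation shows that $C_{a(1-a)} = C_{a\log(1/a)} = C_{\sqrt{a(1-a)}} = \alpha$ and so Theorem 2.10 implies
$\|P^n(x,\cdot) - \pi\|_{TV} \le \alpha^n (1 - 1/m)$, $D(P^n(x,\cdot)\,\|\,\pi) \le \alpha^n \log m$ and $\|P^n(x,\cdot) - \pi\|_{L^2(\pi)} \le \alpha^n \sqrt{m-1}$.
Total variation and $L^2$ bounds are correct, while relative entropy is asymptotically correct.

When $\alpha \in \left[-\frac{1}{m-1}, 0\right]$ then

$$\pi(A_u) = \begin{cases} 0 & \text{if } u > (1-\alpha)\pi(A), \\ \pi(A^c) & \text{if } u \in (\alpha + (1-\alpha)\pi(A),\ (1-\alpha)\pi(A)], \\ 1 & \text{otherwise.} \end{cases}$$

This time $C_{a(1-a)} = C_{\sqrt{a(1-a)}} = -\alpha$ and so $\|P^n(x,\cdot) - \pi\|_{TV} \le (-\alpha)^n (1 - 1/m)$ and
$\|P^n(x,\cdot) - \pi\|_{L^2(\pi)} \le (-\alpha)^n \sqrt{m-1}$, both exact.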
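The closed forms in Example 5.3 can be checked by brute force. The sketch below, with hypothetical helper names and illustrative parameter values, builds the walk on $K_m$, iterates it $n$ times from a fixed start, and computes the total variation and $L^2(\pi)$ distances directly so that they can be compared against $|\alpha|^n(1-1/m)$ and $|\alpha|^n\sqrt{m-1}$.

```python
import math

def complete_graph_walk(m, alpha):
    """P(x,x) = alpha + (1-alpha)/m and P(x,y) = (1-alpha)/m for y != x."""
    return [[alpha * (x == y) + (1.0 - alpha) / m for y in range(m)] for x in range(m)]

def n_step_row(P, x, n):
    """Row x of P^n, obtained by n vector-matrix multiplications."""
    m = len(P)
    row = [1.0 if y == x else 0.0 for y in range(m)]
    for _ in range(n):
        row = [sum(row[z] * P[z][y] for z in range(m)) for y in range(m)]
    return row

def distances(row):
    """Total variation and L^2(pi) distance to the uniform distribution."""
    m = len(row)
    tv = 0.5 * sum(abs(p - 1.0 / m) for p in row)
    l2 = math.sqrt(sum((m * p - 1.0) ** 2 / m for p in row))
    return tv, l2

def closed_forms(m, alpha, n):
    """The exact values stated in Example 5.3."""
    return abs(alpha) ** n * (1.0 - 1.0 / m), abs(alpha) ** n * math.sqrt(m - 1)

m, alpha, n = 5, -0.2, 3          # alpha must lie in [-1/(m-1), 1]
tv, l2 = distances(n_step_row(complete_graph_walk(m, alpha), 0, n))
tv_exact, l2_exact = closed_forms(m, alpha, n)
```

Negative values of `alpha` are included on purpose, since that is the case where the walk has holding probability below $1/m$.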

A harder walk to bound is the simple random walk on the cycle Cm, that is P(x, x ±1) = 1/2. A 
bound must distinguish between the (periodic) walk on a cycle of even length, and the (convergent) 
walk on a cycle of odd length. 

Example 5.4. The walk on a cycle $C_m$ of even length has $\tilde\phi = 0$ because it is bipartite, with the
worst set $A$ given by choosing $m/2$ alternating points around the cycle, and $B = A$ in the definition
of $\Psi(A)$. Therefore $0 = \tilde\phi \ge 1 - C_f \ge 0$ for all of the quantities dealt with in Theorem 3.5. Correctly,
none of our bounds show mixing.

Now for the cycle $C_m$ of odd length. If $\pi(A) \le 1/2$ then $\Psi(A) \ge 1/2m$, with the worst sets given
by points alternating around the cycle, as in the white vertices of Figure 2. Then $\Psi(A) = Q(A,B)$
when $B$ contains those points at least distance two from $A$, one point adjacent to these and $A$, and
the points in $A$, corresponding to the circled regions in Figure 2.

Therefore 

1 - C.„^.,(^) > 4HAMAMAn > 
By Theorem 2.11 it follows that if e > 1/2 then 
Figure 2: Let $A$ be the white vertices and $B$ be the circled points. Then $\Psi(A) = Q(A,B) = 1/2m$.



and so if x G y then 

1 , ^2 1 

||P"(x,-) -TrllTy < 1 Vl + 2n ii n < — - - . (19) 

m 8 2 

Standard techniques give poor bounds for large epsilon, such as e > 1/2 above. 

Bounds for e < 1/2 can be obtained similarly, but better asymptotics can be derived by a slight 
modification of the argument. Observe that 

||P"(x,-)-7r||Ty < -^E7r(5„)(l-7r(5„)) < "^^l;^ "/"^^ Esin(3.147r(5n)) 

7r(x) 7r[x) sm(3.147r*) 

^ sin(3.147r(x)) 7r,(l-7r,) <Ci_7r)C" 
tt{x) sin(3.147r*) "^"('^"^ " ^ *^ 

where 3.14 is used to represent the number tt. The choice of Csin(7ra) is because if > C for some 
constant C then Cf is minimized by /(a) = sin(7ra) (see [11] for details). 

Now, when $p_A = 1/2$ then Lemma 3.4 and equation (10) imply that $C_{\sin(\pi a)}(A) \le \cos(2\pi\Psi(A))$.
On the cycle, if $A \subset V$ then $\pi(A_u) \ge \pi(A)$ when $u \le 1/2$, while $\pi(A_u) \le \pi(A)$ when $u > 1/2$, so
$p_A = 1/2$. Combined with the earlier bound $\Psi(A) \ge 1/2m$ it follows that $C_{\sin(\pi a)}(A) \le \cos(\pi/m)$.
Then

$$\|P^n(x,\cdot) - \pi\|_{TV} \le (1 - \pi_*)\,C^n_{\sin(\pi a)} = (1 - 1/m)\cos^n(\pi/m) . \qquad (20)$$
A fairly close lower bound holds as well. Let $\lambda_{max} = \max\{\lambda_2, |\lambda_n|\}$ be the second largest
magnitude of an eigenvalue of $P$. It is easily verified that $\cos(2\pi k/m)$ is an eigenvalue with eigenvector
$f(j) = \cos(2\pi k j/m)$, so taking $k = (m-1)/2$ gives $\lambda_{max} \ge |\cos(\pi(m-1)/m)| = \cos(\pi/m)$. But then

$$\max_x \|P^n(x,\cdot) - \pi\|_{TV} \ge \tfrac{1}{2}\,\lambda_{max}^n \ge \tfrac{1}{2}\cos^n(\pi/m) . \qquad (21)$$

The first inequality is a general bound for time-reversible chains. 
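To see how tight the pair of bounds (20) and (21) is in practice, one can compute the exact total variation distance for small odd cycles. The following sketch is an illustration only: `cycle_tv` and `cycle_bounds` are names introduced here, the walk is started at vertex 0 (by symmetry every start gives the same distance), and the cycle lengths tried are arbitrary.

```python
import math

def cycle_tv(m, n):
    """Exact total variation distance after n steps of the simple walk on the m-cycle."""
    row = [0.0] * m
    row[0] = 1.0
    for _ in range(n):
        # P(x, x +/- 1) = 1/2
        row = [0.5 * row[(j - 1) % m] + 0.5 * row[(j + 1) % m] for j in range(m)]
    return 0.5 * sum(abs(p - 1.0 / m) for p in row)

def cycle_bounds(m, n):
    """Lower bound (21) and upper bound (20) for odd m."""
    c = math.cos(math.pi / m) ** n
    return 0.5 * c, (1.0 - 1.0 / m) * c

def sandwiched(m, n, tol=1e-12):
    lower, upper = cycle_bounds(m, n)
    return lower - tol <= cycle_tv(m, n) <= upper + tol
```

For instance, `sandwiched(5, 10)` tests both inequalities on the 5-cycle after 10 steps; on the 3-cycle the upper bound is attained, in line with the remark that the walk on three vertices is the case $\alpha = -1/2$ of Example 5.3.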
The closest bound we have found in the literature is

$$\tfrac{1}{2}\cos^n(\pi/m) \le \max_x \|P^n(x,\cdot) - \pi\|_{TV} \le e^{-\pi^2 n/2m^2} \quad \text{if } n \ge m^2/40 .$$

Our bound (20) is at most $(1 - 1/m)\,e^{-\pi^2 n/2m^2}$, mildly better overall and with no conditions on $n$. The
old bound also required knowledge of the complete spectrum of the transition matrix. In contrast, we
required only examination of edge expansion properties.
We finish with an example where our methods give the only known mixing time bounds, the simple
random walk on a directed Eulerian graph.

Example 5.5. Consider a directed Eulerian graph with vertex set $V$ and $m$ edges, that is, a strongly
connected graph with in-degree=out-degree at each vertex. The simple random walk is a walk which
chooses a neighboring vertex uniformly and then transitions there. This walk has $P(x,y) = 1/\deg(x)$
if there is an edge from $x$ to $y$, and stationary distribution $\pi(x) = \deg(x)/m$. It is known that
the lazy simple random walk (i.e. $P(x,x) = 1/2$ and $P(x,y) = 1/2\deg(x)$) has mixing time $\tau_2(\epsilon) =
O(m^2 \log(m/\epsilon))$, but nothing is known about the non-lazy simple random walk even on undirected
graphs.

Before stating a mixing bound we must exclude graphs on which the simple random walk does not
converge, for instance a bipartite graph. More generally, the walk is non-convergent if a directed
graph has $k$ (equal sized) components such that a transition starting in component $i$ always goes
to component $i + 1 \bmod k$. The problem here is that a walk starting in one component has a
neighborhood the same size as the original set, so it never grows to cover the entire space. If we let
$N(A) = \{x \in V : Q(A,x) > 0\}$ denote the neighborhood of $A$, then the following weak expansion
condition will suffice to rule out such situations:

$$\forall A \subset V,\ \pi(A) \le 1/2,\ \forall v \in V : \ \pi(N(A) \setminus v) \ge \pi(A) \qquad (22)$$

This just says that if any single vertex in the neighborhood of $A$ is removed, then the neighborhood
is still at least as big as $A$. Note this cannot be satisfied if some vertex has only one outgoing edge,
and so $\pi_* = \min_{v\in V} \pi(v) \ge 2/m$.
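Condition (22) is simple to test by exhaustive search on small graphs. The sketch below is a brute-force illustration, exponential in the number of vertices, and its interface is an assumption made here: the graph is a dictionary of out-neighbor lists (an undirected edge is entered as two directed edges), it is assumed Eulerian so that $\pi(x) = \deg(x)/m$, and exact rational arithmetic avoids rounding issues in the comparison.

```python
from fractions import Fraction
from itertools import combinations

def satisfies_expansion(out_nbrs):
    """Check condition (22): for all A with pi(A) <= 1/2 and all v,
    pi(N(A) minus v) >= pi(A), where N(A) is the set of out-neighbors of A."""
    V = sorted(out_nbrs)
    m = sum(len(out_nbrs[x]) for x in V)              # number of directed edges
    pi = {x: Fraction(len(out_nbrs[x]), m) for x in V}
    for k in range(1, len(V)):
        for A in combinations(V, k):
            pA = sum(pi[x] for x in A)
            if pA > Fraction(1, 2):
                continue
            N = set().union(*(out_nbrs[x] for x in A))
            for v in V:
                if sum(pi[x] for x in N if x != v) < pA:
                    return False
    return True

def undirected_cycle(m):
    return {j: [(j - 1) % m, (j + 1) % m] for j in range(m)}

directed_triangle = {0: [1], 1: [2], 2: [0]}          # periodic, one out-edge per vertex
complete_k4 = {x: [y for y in range(4) if y != x] for x in range(4)}
```

On these inputs the odd cycle and $K_4$ pass, while the even cycle (bipartite) and the directed triangle (periodic) fail, matching the situations the condition is meant to rule out.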

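We now lower bound $\Psi(A)$. If $A \subset V$ with $\pi(A) \le 1/2$, and if $\Psi(A) = Q(A,B) + (\pi(A^c) -
\pi(B))\frac{Q(A,v)}{\pi(v)}$, then either $N(A) \cap B \ne \emptyset$ or $N(A) \subset B^c$. If $N(A) \subset B^c$ then $\pi(N(A) \setminus v) \le \pi(B^c \setminus v) =
1 - \pi(B \cup v) < \pi(A)$, contradicting the expansion condition. Hence, $N(A) \cap B \ne \emptyset$ and so there are
vertices $x \in A$, $y \in B$ with $P(x,y) > 0$. Then

$$\Psi(A) \ge Q(A,B) \ge \pi(x)P(x,y) = \frac{\deg(x)}{m}\,\frac{1}{\deg(x)} = \frac{1}{m} .$$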



It follows that d>(r) > 



m deg{x) 

mr{i-r) ^ — ■'"Z^' ^° from the convex version of equation (11) that 



T2(e) < 



■ ^1/2 
J 2/m 



dr 



< 



2/m 2r(l - r)4>{ry/2 

rn? ^ 1 

l2 



+ 



1 



dr 



1/2 2r(l-r)(^V2 



The same argument can be used to improve on the classical $\tau_2(\epsilon) = O(m^2 \log(m/\epsilon))$ bound for
the lazy simple walk. Every lazy walk has $\Psi(A) = Q(A,A^c)$, and so $\Psi(A) \ge 1/2m$ even without the
expansion condition. It follows that $\tilde\Phi(r) \ge \frac{1}{2m\,r(1-r)}$, and the lazy simple random walk mixes in



T2{e) < 



m 



m,^ 1 

1 log - 

3 2 ^ e 



Note that the (lazy or non-lazy) simple random walk on a cycle with an odd number of vertices
has $\tau_2(\epsilon) = \Omega(m^2 \log\frac{1}{\epsilon})$, and so even for the lazy simple random walk our bounds are the first ones of
the correct order.
A total variation bound can be found by integrating the appropriate total variation result of
Theorems 2.11 and 3.5. Instead, to give a taste of what improvements can be made, we note that in 
[12] the above technique is sharpened to show that the (non-lazy) simple random walk satisfies 



1 



log 



1 - 2/m 



-log cos ^ 



e 



This bound is exact for the simple random walk on a cycle with 3 vertices (i.e. with a = —1/2 in 
Example 5.3), while more generally equation (21) shows an extremely close lower bound for a cycle 
with an odd number of vertices: 



Numerous other improvements and generalizations are possible. See [12] in which we sharpen this 
analysis further, extend it to show bounds on other walks such as the max-degree walk, and also give 
near-optimal bounds for spectral gap and other quantities of interest. 
Appendix

Remark 2.13 used the following: 
Lemma 5.6. 

'-alog(l/o) *-'a(l— a) 

{A)). 

Proof. By Lemma 4.4 it suffices to show that 'iT{A)'K{A'^)'ipgi{A) < ijjjnod{A)/ log{l/Tr{A)), that is, 

Jo \tlogil/n{A)) ) y ^ ) - 

If t < TTiA) then nHiSMfe ^ 1' ^"^^ "° ^0^^^ { no\~ii/%) " l) ^iA',t)dt > 0. Moreover, when 
t > Tr{A) then ^'(A'^,t) is a positive decreasing function, and so 

( ^'^-''}^\^ -i)^{A',t)dt> C ( ^^~''}f^,^ -l]^{A',to)dt = 
A(^) Vilog(l/7r(^)) ; ^ -A(^) Vtlog(lM^)) J y 

when to is the solution to the equation t„ iog(i/^(A)) ~ 1 = 0- D 

Lemma 5.7 is a non-lazy, non-reversible extension of a result of [13], and so it was left for the
Appendix.

Lemma 5.7. Given a finite ergodic Markov chain and $A \subset V$ then

$$\Psi(A^c,t) = \begin{cases} \int_{w(t)}^1 (t - \pi(A_u))\,du & \text{if } t \le \pi(A) \\[4pt] \int_0^{w(t)} (\pi(A_u) - t)\,du & \text{if } t > \pi(A) \end{cases}$$

where $w(t)$ is any value satisfying $\inf\{y : \pi(A_y) \le t\} \le w(t) \le \sup\{y : \pi(A_y) \ge t\}$.
Proof. We consider only the case that $t \le \pi(A)$, as the case when $t > \pi(A)$ is similar.

By definition, if v, G and ^ A^ then > and equivalently < 

Hence, if TT{A^(^f-j) = t then B = is the same set where the minimum occurs in the definition of 

'if{A'^,t). If instead 7r(^^(j)) > t, then B = U„>^(f)^u is the set where the minimum occurs in the 
definition of '^{A'^,t), and if v is any vertex in ^^(t) \ B then Q{A,v)/Tr{v) = w{t). In both cases 



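$$\Psi(A^c,t) = Q(A^c,B) + (t - \pi(B))\,\frac{Q(A^c,v)}{\pi(v)} .$$

Let $B$ be as defined in the previous paragraph. Then, $B \subset A_{w(t)}$ and $A_u \subset B$ whenever $u > w(t)$,
and so

$$\int_{w(t)}^1 (t - \pi(A_u))\,du = t(1 - w(t)) - \sum_{y\in B}\left(\frac{Q(A,y)}{\pi(y)} - w(t)\right)\pi(y)$$
$$= t(1 - w(t)) - \left(Q(A,B) - w(t)\,\pi(B)\right)$$
$$= (1 - w(t))(t - \pi(B)) + Q(A^c,B)$$
$$= Q(A^c,B) + (t - \pi(B))\,\frac{Q(A^c,v)}{\pi(v)} .$$

The first equality is because $\int_{w(t)}^1 \pi(A_u)\,du = \sum_{y\in A_{w(t)}} \left(\mathrm{Prob}(y \in A_u) - w(t)\right)\pi(y)$. The third equality uses
$Q(A,B) = \pi(B) - Q(A^c,B)$. The fourth equality is because $1 - w(t) = 1 - \frac{Q(A,v)}{\pi(v)} = \frac{Q(A^c,v)}{\pi(v)}$ by our
choice of $v$ and $w(t)$.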

□

The inequalities in the following two lemmas were used in proving Theorem 3.5.
Lemma 5.8. If $x \in (0,1)$ and $y \in [0, 1-x)$ then

$$g(x,y) = (x+y)\log\frac{x+y}{x} + (1-x-y)\log\frac{1-x-y}{1-x} \ge 2y^2 .$$



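Proof. Start by seeing what can be shown by differentiation.

$$\frac{\partial g}{\partial x} = \log\left(1 + \frac{y}{x}\right) - \frac{y}{x} - \log\left(1 - \frac{y}{1-x}\right) - \frac{y}{1-x}$$

$$\frac{\partial^2 g}{\partial x^2} = y^2\,\frac{(x+y)\,x^2 + (1-x)^2\,(1-(x+y))}{x^2(1-x)^2(x+y)(1-(x+y))} \ge 0$$

It follows that $g(x,y)$ is convex with respect to $x$, and since
$\left.\frac{\partial g}{\partial x}\right|_{x=(1-y)/2} \le 0$ and $\left.\frac{\partial g}{\partial x}\right|_{x=1/2} \ge 0$ then
the minimum occurs at some $x \in [(1-y)/2, 1/2]$.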

To lower bound the minimum we first lower bound $g(x,y)$. By the inequality $f(z) = z\log z + (1-z)\log(1-z) \ge
-\log 2 + 2(z - 1/2)^2$ for $z \in [0,1]$ it follows that

$$g(x,y) = f(x+y) - \log(1-x) + (x+y)\log\frac{1-x}{x} \ge h(x,y)$$

where $h(x,y) = -\log 2 + 2(x+y-1/2)^2 - \log(1-x) + (x+y)\log\frac{1-x}{x}$. Now,

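$$\frac{\partial h}{\partial x} = y\left(4 - \frac{1}{x(1-x)}\right) + 4x - 2 + \log\frac{1-x}{x}$$

$$\frac{\partial^2 h}{\partial x^2} = y\,\frac{1-2x}{x^2(1-x)^2} + 4 - \frac{1}{x(1-x)}$$

$$\left.\frac{\partial^2 h}{\partial x^2}\right|_{x=(1-cy)/2} = \frac{4cy^2\left(4 - c\,(1 - c^2y^2)\right)}{(1 - c^2y^2)^2}$$

The second derivative is positive when $c \in [0,1]$, and so $h(x,y)$ is convex in $x$ when $x \in [(1-y)/2, 1/2]$.
However, $\left.\frac{\partial h}{\partial x}\right|_{x=1/2} = 0$ and so $h(x,y) \ge h(1/2,y) = 2y^2$ when $x \in [(1-y)/2, 1/2]$.

It follows that $g(x,y) \ge \min_{x\in[(1-y)/2,\,1/2]} g(x,y) \ge \min_{x\in[(1-y)/2,\,1/2]} h(x,y) \ge h(1/2,y) = 2y^2$. □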

Lemma 5.9. If $X, Y \in [0,1]$ then

$$g(X,Y) = \sqrt{XY} + \sqrt{(1-X)(1-Y)} \le \sqrt{1 - (X-Y)^2} .$$

Proof. Observe that

$$g(X,Y)^2 = 1 - (X+Y) + 2XY + \sqrt{[1 - (X+Y) + 2XY]^2 - [1 - 2(X+Y) + (X+Y)^2]} .$$

Now, $\sqrt{A^2 - B} \le A - B$ if $A^2 \ge B$, $A \le \frac{1+B}{2}$ and $A \ge B$ (square both sides to show this). These
conditions are easily verified with $A = 1 - (X+Y) + 2XY$ and $B = 1 - 2(X+Y) + (X+Y)^2$, and so

$$g(X,Y)^2 \le 2[1 - (X+Y) + 2XY] - [1 - 2(X+Y) + (X+Y)^2] = 1 + 2XY - X^2 - Y^2 = 1 - (X-Y)^2 .$$

□
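Both elementary inequalities, Lemma 5.8 and Lemma 5.9, can be sanity-checked on a grid. This is only a numerical illustration, not a substitute for the proofs; the grid resolution and function names are arbitrary choices made here.

```python
import math

def g58(x, y):
    """Left side of Lemma 5.8, for x in (0,1) and y in [0, 1-x)."""
    s = x + y
    return s * math.log(s / x) + (1.0 - s) * math.log((1.0 - s) / (1.0 - x))

def min_slack_58(steps=200):
    """Smallest value of g(x,y) - 2 y^2 over the grid; Lemma 5.8 says it is >= 0."""
    worst = float("inf")
    for i in range(1, steps):
        for j in range(0, steps - i):          # keeps x + y strictly below 1
            x, y = i / steps, j / steps
            worst = min(worst, g58(x, y) - 2.0 * y * y)
    return worst

def max_excess_59(steps=200):
    """Largest value of sqrt(XY) + sqrt((1-X)(1-Y)) - sqrt(1-(X-Y)^2); Lemma 5.9 says <= 0."""
    worst = -float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1):
            X, Y = i / steps, j / steps
            lhs = math.sqrt(X * Y) + math.sqrt((1.0 - X) * (1.0 - Y))
            worst = max(worst, lhs - math.sqrt(1.0 - (X - Y) ** 2))
    return worst
```

Lemma 5.8 is tight at $y = 0$ and Lemma 5.9 is tight on the diagonal $X = Y$, so both extreme values are expected to be zero up to rounding.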